Stationarity of Manifold Time Series
Junhao Zhu*, Dehan Kong†, Zhaolei Zhang†,
University of Toronto,
and
Zhenhua Lin‡
National University of Singapore
Abstract
In modern interdisciplinary research, manifold time series data have been garnering
more attention. A critical question in analyzing such data is “stationarity”, which reflects the
underlying dynamic behavior and is crucial across various fields like cell biology, neuroscience
and empirical finance. Yet, there has been an absence of a formal definition of stationarity that
is tailored to manifold time series. This work bridges this gap by proposing the first definitions
of first-order and second-order stationarity for manifold time series. Additionally, we develop
novel statistical procedures to test the stationarity of manifold time series and study their
asymptotic properties. Our methods account for the curved nature of manifolds, leading to
a more intricate analysis than that in Euclidean space. The effectiveness of our methods is
evaluated through numerical simulations and their practical merits are demonstrated through
analyzing a cell-type proportion time series dataset from a paper recently published in Cell.
The first-order stationarity test result aligns with the biological findings of this paper, while
the second-order stationarity test provides numerical support for a critical assumption made
therein.
Keywords: bootstrap, CUSUM, curvature, spectral density, sphere.
*JZ is partially supported by CANSSI (Canadian Statistical Sciences Institute), Data Science Institute and Medicine
by Design, University of Toronto
†DK and ZZ acknowledge financial support from a Catalyst Grant from Data Science Institute and Medicine by
Design, University of Toronto.
‡ZL research is partially supported by the NUS startup grant A-0004816-00-00
arXiv:2409.17706v1  [stat.ME]  26 Sep 2024

1 Introduction
Recent advances in scientific research have introduced various complex data types; a notable category among
these is manifold time series, which refer to temporal data with values residing on manifolds.
Central to the exploration of these datasets is a crucial question: is a manifold time series
“stationary”? This inquiry is vital for a thorough understanding of the data’s dynamic nature and
its implications in the broader context of the study.
For example, in cell biology, the pioneering study by Schiebinger et al. (2019) introduced
Waddington Optimal Transport (WOT) for investigating cellular developmental paths and transitions
between cell types by tracking changes in cell-type proportions over time. These proportions,
represented on a unit sphere (e.g., Scealy & Welsh 2011), form a spherical time series. Stationarity
in this context reflects dynamic equilibrium in cellular development, such as stable populations
in stem cell differentiation. The relevance of “stationarity” in manifold time series (here, the
unit sphere) to WOT emerges in two key ways. First, WOT seeks to capture the evolving trend
in a spherical time series of non-stationary cell-type proportions, yet it lacks a formal method
to distinguish genuine non-stationarity from random fluctuations. Second, WOT implicitly
presumes that the randomness arising from cellular proliferation and apoptosis, or from technical
noise of sequencing platforms, is constant over time, without thorough statistical justification. These aspects relate
to first- and second-order stationarity in manifold time series.
As another example, in neuroscience, there is a growing interest in modelling time series with
values in the manifold of symmetric positive definite (SPD) matrices to study dynamic resting
state functional connectivity and to reveal the fundamental mechanisms underlying brain networks
(Yang et al. 2020). Typically, one interesting question is to determine the “stationarity” of the
SPD-matrices-valued manifold time series. Scientists are interested in whether the observed
temporal fluctuation in functional connectivity values reflects a reliable “non-stationarity”, or
is merely attributable to noise and statistical uncertainty.
The above examples show that determining/testing “stationarity” of manifold time series
is pivotal for advancing our knowledge in these complex biological fields. The concept of
“stationarity” in manifold time series is not restricted to biological studies. In empirical finance, an
important question is to determine whether the correlation matrices of returns, as a time series
residing in a sub-manifold of SPD matrices, undergo some systematic shift over time (Wied et al.
2012). Several promising results on spherical or general non-Euclidean time series analysis
have been proposed. Fisher & Lee (1994) and Zhu & Müller (2024) mainly focus on estimation of
auto-regressive models in sphere-valued time series. Dubey & Müller (2020), Wang et al. (2023)
and Jiang et al. (2024) investigated change-point detection in non-Euclidean data, assuming time
series are segmented into blocks with constant mean and variance. However, their methods
did not address more general forms of weak stationarity or account for continuous underlying
dynamics in the time series. van Delft & Blumberg (2024) explored testing for strong stationarity
in time-varying metric measure spaces, where each data point in the time series is a metric space
instead of a point within a given manifold. A visible gap remains: none of the existing works has
formally defined first- and second-order stationarity for manifold time series. The
existing weak stationarity definition and testing methods (Zhou 2013, Aue & van Delft 2020, van
Delft et al. 2021) are only applicable to data in Euclidean or Hilbert spaces.
To bridge this gap, we propose the first definition of the first-order and second-order stationarity
of manifold time series, and develop corresponding testing procedures to determine whether a
manifold time series exhibits either first-order or second-order stationarity, based on our extension
of locally stationary time series to manifolds. The notion of local stationarity, originally formulated
for time series in Euclidean space, assumes a data-generating scheme varying smoothly within
local time intervals (e.g., Priestley 1988, Dahlhaus 1997, Zhou & Wu 2009). In our work, local
stationarity allows a proper definition of the second-order stationarity of a manifold time series
which may not be first-order stationary, and connects the manifold calculus with tools of asymptotic
statistics to facilitate derivation of asymptotic properties of our test statistics. Local stationarity is
a reasonable assumption in our real data application to cell biology, as demonstrated in Figure 4A
and related works of cell biology (e.g., Lähnemann et al. 2020).
Our main contributions are summarized as follows:
1. We propose the first definition of the first-order stationarity and second-order stationarity
for manifold time series. Our definition incorporates the stationarity of multivariate time
series in Euclidean space as a special case. As demonstrated in the above examples, these
concepts are crucial for addressing practical scientific inquiries.
2. We develop procedures to test the first-order stationarity of manifold time series based on
the Cumulative Sum (CUSUM) (Page 1954) of residuals in the tangent space at the sample
intrinsic mean. The tangent space at the sample intrinsic mean is not identical to the tangent
space at the population intrinsic mean due to the curved nature of manifolds as shown in
Figure 1(a). This property makes the CUSUM statistic in manifolds more complicated
than its counterpart in Euclidean space. We show that, under the null, the
$L^2$-norm of the CUSUM of residuals induced by the Riemannian metric converges in distribution to
the sup-norm of a process in the tangent space at the population intrinsic mean, of the
form $U(t) - H(t) \circ H(1)^{-1} \circ U(1)$, where $U(\cdot)$ is a centered Gaussian process with an
unknown covariance operator and $H(\cdot)$ is an unknown invertible linear operator induced
by the curvature. We propose a test that uses the Gaussian multiplier
bootstrap to mimic $U(\cdot)$ and estimates the operator-valued function $H(t)$. We establish the
consistency of our method and provide the local alternative distribution to show that our
method has local power at the rate $O(T^{-1/2})$, where $T$ is the length of the time series.
3. We develop a second-order stationarity test for manifold time series, and establish
asymptotic properties for the test statistic. One of the major challenges lies in determining the
asymptotic distribution of the test statistic, since the curved nature of manifolds introduces
an additional $O_p(T^{-1/2})$ term to the test statistic compared to Euclidean space. Surprisingly,
under certain regularity conditions, the null distribution of the test statistic for manifold
time series is invariant to manifold curvatures, asymptotically converges to a Gaussian
distribution and aligns with its counterparts in Euclidean space. In contrast, under the
alternative hypothesis, although the test statistic still asymptotically follows a Gaussian
distribution, it exhibits a difference in variance from its Euclidean counterpart.
The rest of the paper is structured as follows. In Section 2, we introduce background on
Riemannian manifolds and Euclidean time series. Section 3 defines the first- and second-order
stationarity for manifold time series. In Section 4, we develop statistical tests for the stationarity
of manifold time series, and establish the corresponding asymptotic properties of the test statistics
under null and alternative hypotheses. Simulations and real data application are presented in
Section 5 and Section 6, respectively. In Section 7, we end with a brief discussion.
2 Background
Before introducing the definition of stationarity and the methods of stationarity test within the
context of manifold time series, we briefly review the concepts of stationarity in Euclidean space
$\mathbb{R}^D$, some background on Riemannian manifolds, and the intrinsic mean.
2.1 Stationarity and local stationarity
The notion of stationarity is important as it guarantees the consistency and validity of most
modelling and testing procedures in time series data analysis (Shumway et al. 2000). The definition of
stationarity is given as follows:
• A collection of $\mathbb{R}^D$-valued random vectors $\{X_i\}_{i=1}^T$ is first-order stationary if $\mathbb{E}(X_i) \equiv \mu$
for some constant $\mu \in \mathbb{R}^D$. It is second-order stationary if the auto-covariance matrix
$\mathbb{E}\{(X_i - \mathbb{E}X_i)(X_j - \mathbb{E}X_j)^\top\}$ depends only on the lag $|i - j|$. If a time series is both first- and
second-order stationary, then it is stationary.
In real-world time series data, stationarity may not always hold. The notion of local
stationarity (Dahlhaus 1997, Zhou & Wu 2009) offers a way to relax the
stationarity assumption, enabling flexible modeling of changes in mean and dependency structures.
Zhou & Wu (2009) define the local stationarity of time series in Euclidean space $\mathbb{R}^D$ as follows:
• A collection of $\mathbb{R}^D$-valued random vectors $\{X_i\}_{i=1}^T$ is a locally stationary time series if
there exists an unknown measurable filter function $G$ such that $X_i = G(i/T, \mathcal{F}_i)$, where
$\mathcal{F}_i = (\cdots, \varepsilon_0, \cdots, \varepsilon_{i-1}, \varepsilon_i)$, $\{\varepsilon_i\}_{i \in \mathbb{Z}}$ are i.i.d. random variables, and $G$ satisfies some smoothness
conditions.
The above definition includes many time series models, such as time-varying linear processes
and time-varying GARCH models (Bollerslev 1986) satisfying some regularity conditions (Wu &
Zhou 2011, Zhou 2013). If the filter G is further independent of t, then the time series is stationary.
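To illustrate the filter representation $X_i = G(i/T, \mathcal{F}_i)$, the following Python sketch simulates a scalar time-varying linear process with $G(t, \mathcal{F}_i) = \sigma(t) \sum_{j \ge 0} a(t)^j \varepsilon_{i-j}$. The function name, the coefficient functions `a` and `sigma`, and the truncation of the infinite sum at `lags` terms are illustrative assumptions and are not taken from the cited works.

```python
import random

def simulate_tv_linear_process(T, a, sigma, lags=200, seed=0):
    """Simulate X_i = G(i/T, F_i), G(t, F_i) = sigma(t) * sum_j a(t)**j * eps_{i-j}.

    The infinite sum over j >= 0 is truncated at `lags` terms (an approximation).
    """
    rng = random.Random(seed)
    # eps[lags + i - 1] plays the role of eps_i for i = 1, ..., T.
    eps = [rng.gauss(0.0, 1.0) for _ in range(T + lags)]
    xs = []
    for i in range(1, T + 1):
        t = i / T  # rescaled time in (0, 1]
        now = lags + i - 1
        xs.append(sigma(t) * sum(a(t) ** j * eps[now - j] for j in range(lags)))
    return xs

# Example: an AR-type coefficient and a noise scale that both drift smoothly in t.
series = simulate_tv_linear_process(
    T=500, a=lambda t: 0.2 + 0.5 * t, sigma=lambda t: 1.0 + t)
```

If `a` and `sigma` are constant functions, the filter no longer depends on $t$ and the simulated series is stationary, up to the truncation.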
2.2 Riemannian manifold
Below we briefly introduce some basic concepts of Riemannian manifolds that are essential to our
development, with slight emphasis on geometric intuition rather than mathematical rigour. We
refer readers to a self-contained note by Shao et al. (2022) for more details and to the textbook by
Do Carmo (1992) for a more comprehensive treatment.
A topological space $\mathcal{M}$ is called a differentiable manifold of dimension $D$ if it admits a maximal
differentiable atlas that consists of coordinate systems $(U_\alpha, x_\alpha)$ for $\alpha \in J$, such that $\bigcup_{\alpha \in J} U_\alpha = \mathcal{M}$
and $x_\alpha \circ x_\beta^{-1}$ is differentiable whenever $U_\alpha \cap U_\beta \neq \emptyset$, where $J$ is an index set and each $x_\alpha : U_\alpha \to \mathbb{R}^D$
is a coordinate map. A curve $c : (-\epsilon, \epsilon) \to \mathcal{M}$ is differentiable at $p$ if $p = c(0)$ and there exists a
coordinate system $(U_\alpha, x_\alpha)$ such that $p \in U_\alpha$ and $x_\alpha \circ c$ is differentiable at 0. The tangent vector to
the curve $c$ at $t = 0$ is a linear functional $c'(0)$ such that for any function $f$ differentiable at $p$ we
have $c'(0) f = d(f \circ c)(0)/dt$. The tangent space at $p$ is the linear space of all tangent vectors at $p$,
denoted by $T_p\mathcal{M}$. The aggregation of all tangent spaces $\bigcup_{p \in \mathcal{M}} T_p\mathcal{M}$ is called the tangent bundle of
$\mathcal{M}$, denoted by $T\mathcal{M}$.
A differentiable manifold M is a Riemannian manifold if it is additionally equipped with a
Riemannian metric which defines a smoothly varying inner product ⟨⋅,⋅⟩p ∶TpM × TpM →R
for each point p in M. The Riemannian metric also induces a norm ∥⋅∥p on each TpM, and
induces a distance function on M, denoted by dM(⋅,⋅), so that M endowed with dM(⋅,⋅) is
a metric space. In addition, the Riemannian metric uniquely determines an affine connection
called Levi-Civita connection ∇∶TpM × TpM →TpM, which allows us to connect nearby
tangent spaces and to define the directional derivatives of tangent vectors. An important geometric
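characteristic of a manifold is its curvature. Formally, the curvature on a Riemannian manifold $\mathcal{M}$
is defined as a tensor, given by $R_{\mathcal{M}}(U, V) = \nabla_V \nabla_U - \nabla_U \nabla_V + \nabla_{[U,V]}$, where $U, V$ are two vector
fields on $\mathcal{M}$ and $[U, V] = UV - VU$. Given a point $p \in \mathcal{M}$ and a two-dimensional subspace of
$T_p\mathcal{M}$ spanned by two linearly independent tangent vectors $u, v \in T_p\mathcal{M}$, the sectional curvature
is defined as $\kappa(u, v, p) = \langle R_{\mathcal{M}}(u, v)u, v \rangle_p / (\|u\|_p^2 \|v\|_p^2 - \langle u, v \rangle_p^2)$. If $\kappa(u, v, p) \le 0$ ($\ge 0$) for any
$(u, v, p) \in T_p\mathcal{M} \times T_p\mathcal{M} \times \mathcal{M}$, then we say $\mathcal{M}$ is a non-positively-curved (non-negatively-curved)
manifold. For Euclidean space $\mathcal{M} = \mathbb{R}^D$, one can show that $R_{\mathcal{M}} \equiv 0$ and $\kappa(u, v, p) \equiv 0$. Intuitively,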
the deviation of the curvature tensor or the sectional curvature from 0 quantifies how a manifold
bends or curves.
Let c(t) be a differentiable curve with c(0) = p, and v be a tangent vector in TpM. The
parallel transport of $v$ along $c(t)$ is a vector field $V(t)$ defined on $T_{c(t)}\mathcal{M}$ such that $V(0) = v$ and
$\nabla_{c'(t)} V(t) = 0$. Denote the parallel transport of $v \in T_{c(s)}\mathcal{M}$ to $T_{c(t)}\mathcal{M}$ along $c$ by $P_{c(s)}^{c(t)}(v)$. The
collection $\{E_1(t), \cdots, E_D(t) : 0 \le t \le 1\}$, denoted by $E$, is called a parallel orthonormal frame along
$c$ if it satisfies the following conditions:
1. $E_k(t) = P_{c(0)}^{c(t)} E_k(0) \in T_{c(t)}\mathcal{M}$, for any $t \in [0,1]$ and $k \in \{1, \cdots, D\}$.
2. $\langle E_k(t), E_l(t) \rangle_{c(t)} = \delta_{kl}$ for any $t \in [0,1]$ and $k, l \in \{1, \cdots, D\}$, where $\delta_{kl}$ equals 1 if $k = l$,
and 0 if $k \neq l$.
We write E(t) = {E1(t),⋯,ED(t)}.
A differentiable curve $\gamma$ is a geodesic if $\nabla_{\gamma'(t)} \gamma'(t) = 0$. The concept of geodesic generalizes
the straight line in Euclidean space. For any $p \in \mathcal{M}$ and $v \in T_p\mathcal{M}$, there exists a unique geodesic $\gamma_v$
such that $\gamma_v(0) = p$ and $\gamma_v'(0) = v$, which gives rise to the Riemannian exponential map $\mathrm{Exp}_p(v) = \gamma_v(1)$.
There is a neighborhood $\mathcal{E}_p \subset T_p\mathcal{M}$ such that $\mathrm{Exp}_p$ is bijective on $\mathcal{E}_p$. Therefore, restricting
$\mathrm{Exp}_p$ to $\mathcal{E}_p$, we can define its inverse. This inverse is called the Riemannian logarithmic map at $p$,
denoted by $\mathrm{Log}_p$, satisfying $\mathrm{Log}_p(\mathrm{Exp}_p v) = v$ for $v \in \mathcal{E}_p$.
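For intuition, the exponential and logarithmic maps have closed forms on the unit sphere $S^2 \subset \mathbb{R}^3$ with the metric inherited from $\mathbb{R}^3$. The Python sketch below is a minimal illustration under that assumption; the function names `exp_map` and `log_map` are hypothetical and points are stored as plain lists.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exp_map(p, v):
    """Exp_p(v): follow the great circle from p in direction v for length |v|."""
    theta = norm(v)
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * pk + math.sin(theta) * vk / theta
            for pk, vk in zip(p, v)]

def log_map(p, x):
    """Log_p(x): tangent vector at p pointing towards x, with length d(p, x)."""
    c = max(-1.0, min(1.0, dot(p, x)))
    w = [xk - c * pk for pk, xk in zip(p, x)]   # part of x orthogonal to p
    nw = norm(w)
    if nw < 1e-12:          # x equals p (or its antipode, where Log is undefined)
        return [0.0] * len(p)
    theta = math.acos(c)    # geodesic distance d(p, x)
    return [theta * wk / nw for wk in w]

p = [0.0, 0.0, 1.0]
v = [0.3, -0.4, 0.0]        # tangent at p, since <p, v> = 0
x = exp_map(p, v)
v_back = log_map(p, x)      # recovers v because |v| < pi
```

Here the neighborhood on which the exponential map is invertible is the open ball of radius $\pi$ in the tangent plane, which is why the round trip recovers `v`.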
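Let $f_X(\cdot) = d_{\mathcal{M}}^2(\cdot, X)/2$, and denote by $\partial_p f_X \in T_p\mathcal{M}$ the Riemannian gradient of $f_X$ at $p$, that
is, for any tangent vector $u \in T_p\mathcal{M}$, $u(f_X)(p) = \langle \partial_p f_X, u \rangle_p$. We also let $S_p\mathcal{M}$ denote the space
of self-adjoint operators on $T_p\mathcal{M}$ and $H(p, X)$ denote the Riemannian Hessian operator of the
function $f_X(\cdot)$ at $p$, i.e., the operator in $S_p\mathcal{M}$ such that for any tangent vectors $u, v \in T_p\mathcal{M}$,
$\langle H(p, X)u, v \rangle_p = \langle \nabla_u \partial_p f_X, v \rangle_p = \langle \nabla_v \partial_p f_X, u \rangle_p = \langle H(p, X)v, u \rangle_p$.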
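As a concrete instance, on the unit sphere the Hessian of $f_X$ has a closed form: with $\theta = d_{\mathcal{M}}(p, X)$, it acts as the identity along the direction of $\mathrm{Log}_p X$ and multiplies directions orthogonal to it by $\theta \cot\theta$. The sketch below assumes the unit sphere $S^2$ and uses hypothetical function names; it is an illustration, not code from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_map(p, v):
    theta = math.sqrt(dot(v, v))
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * a + math.sin(theta) * b / theta
            for a, b in zip(p, v)]

def half_sq_dist(p, x):
    """f_X(p) = d^2(p, X) / 2 on the unit sphere."""
    return 0.5 * math.acos(max(-1.0, min(1.0, dot(p, x)))) ** 2

def hessian_apply(p, x, u):
    """Apply H(p, X) to a tangent vector u at p (requires d(p, X) < pi)."""
    c = max(-1.0, min(1.0, dot(p, x)))
    theta = math.acos(c)
    if theta < 1e-8:
        return list(u)                  # theta * cot(theta) -> 1 as theta -> 0
    w = [b - c * a for a, b in zip(p, x)]
    nw = math.sqrt(dot(w, w))
    r = [wk / nw for wk in w]           # unit vector along Log_p X
    k = theta / math.tan(theta)         # transverse eigenvalue theta * cot(theta)
    ur = dot(u, r)
    return [ur * rk + k * (uk - ur * rk) for uk, rk in zip(u, r)]

p = [0.0, 0.0, 1.0]
x = exp_map(p, [0.8, 0.0, 0.0])
u = [0.6, 0.8, 0.0]                     # unit tangent vector at p
quad = dot(hessian_apply(p, x, u), u)   # <H(p, X) u, u>_p
```

The quadratic form `quad` equals the second derivative of $s \mapsto f_X(\mathrm{Exp}_p(su))$ at $s = 0$, which gives a simple finite-difference check of the formula. Since $\theta\cot\theta < 1$ for $\theta > 0$, positive curvature shrinks the Hessian relative to the Euclidean identity.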
2.3 Intrinsic mean
In curved Riemannian manifolds, the concepts of algebraic addition and the usual mean/average
do not apply. The notion of the intrinsic mean, proposed by Fréchet (1948), serves as a well
established generalization of the traditional mean in the literature. For a random element X in a
metric space $\mathcal{M}$ with a distance function $d_{\mathcal{M}}$, we say $\mu$ is the intrinsic mean (or Fréchet mean) of $X$
if
$$\mu = \arg\min_{p \in \mathcal{M}} \mathbb{E}\, d_{\mathcal{M}}^2(X, p). \tag{1}$$
Unlike the arithmetic mean, which is well-defined for data in Euclidean spaces, the intrinsic mean
extends the idea of finding a central point to spaces where the notion of averaging as simple
arithmetic might not make sense. In particular, the intrinsic mean mimics the Euclidean mean in
the sense that it minimizes the average squared distance to X. The intrinsic mean is a popular
tool to model metric-space (including Riemannian manifolds as a special case) valued data in
different contexts, such as regression for non-Euclidean data (Petersen & Müller 2019, Shao et al.
2022), change-point detection (Jiang et al. 2024, Dubey & Müller 2020) in metric space, and
generalized principal component analysis for manifold-valued data (Pennec 2018). To ensure the
existence and uniqueness of µ in Eq. (1), we assume one of the following conditions:
(M1) M is a simply connected and complete manifold, with bounded non-positive sectional
curvatures.
(M2) M is a simply connected and complete subset of a complete Riemannian manifold with
positive sectional curvatures upper bounded by κ > 0, and satisfies a bounded diameter
condition: $\sup_{p, q \in \mathcal{M}} d_{\mathcal{M}}(p, q) < \pi/\kappa^{1/2}$.
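The sample version of Eq. (1) can be computed by a simple fixed-point iteration that repeatedly moves the current estimate along the average of the log-mapped data. The sketch below does this on the unit sphere $S^2$; the choice of manifold, the unit step size, the starting point, and the function names are assumptions made for illustration, and the iteration is only expected to behave well when the data are concentrated enough for the minimizer to be unique.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exp_map(p, v):
    theta = norm(v)
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * a + math.sin(theta) * b / theta
            for a, b in zip(p, v)]

def log_map(p, x):
    c = max(-1.0, min(1.0, dot(p, x)))
    w = [b - c * a for a, b in zip(p, x)]
    nw = norm(w)
    if nw < 1e-12:
        return [0.0] * len(p)
    return [math.acos(c) * wk / nw for wk in w]

def frechet_mean(xs, iters=200, tol=1e-12):
    """Sample intrinsic mean on the unit sphere via fixed-point iteration."""
    p = list(xs[0])                      # start from the first observation
    for _ in range(iters):
        logs = [log_map(p, x) for x in xs]
        # Average log-mapped residual: a descent direction for sum_i d^2(p, X_i).
        g = [sum(col) / len(xs) for col in zip(*logs)]
        p = exp_map(p, g)
        if norm(g) < tol:                # residuals average to zero at the mean
            break
    return p

pole = [0.0, 0.0, 1.0]
tangents = ([0.3, 0.0, 0.0], [-0.3, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, -0.3, 0.0])
data = [exp_map(pole, v) for v in tangents]
mu_hat = frechet_mean(data)              # symmetric data: the mean is the pole
```

At the returned point the log-mapped data average to (numerically) zero, which is the first-order condition for the minimization in Eq. (1).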
3 Stationarity on Riemannian Manifolds
In this section, we introduce the definition of stationarity and local stationarity of manifold time
series. Let M be a Riemannian manifold of dimension D satisfying conditions (M1) or (M2),
and µ(t) ∶[0,1] →M be a smooth curve on M, associated with a parallel orthonormal frame
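$E = \{E_1(t), \cdots, E_D(t) : 0 \le t \le 1\}$. For any $e = (e_1, \cdots, e_D) \in \mathbb{R}^D$, $e^\top E(t)$ denotes the vector
in $T_{\mu(t)}\mathcal{M}$ with coordinate representation $(e_1, \cdots, e_D)$ under the basis $\{E_1(t), \cdots, E_D(t)\}$, i.e.,
$e^\top E(t) = \sum_{j=1}^D e_j E_j(t)$.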
Definition 1 (first-order stationarity) A manifold time series $\{X_i\}_{i=1}^T$ on $\mathcal{M}$ is first-order stationary
if there exists $\mu \in \mathcal{M}$ such that $\mu = \arg\min_{p \in \mathcal{M}} \mathbb{E}\, d_{\mathcal{M}}^2(X_i, p)$ holds for all $i = 1, \cdots, T$, i.e., when its
intrinsic mean stays constant.
Before defining the second-order stationarity for manifold time series, we need to introduce the
notion of local stationarity. Traditionally, the second-order stationarity in Euclidean space is
defined for first-order stationary time series. However, it is common in practice that a time series
is trend-stationary, i.e., it is second-order stationary after subtracting a deterministic trend. In
order to incorporate this wider sense of second-order stationarity for manifold time series, we first
introduce the local stationarity.
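Definition 2 (local stationarity) A manifold time series $\{X_i\}_{i=1}^T$ on $\mathcal{M}$ is locally stationary with
the mean function $\mu(t)$ if there exist a parallel orthonormal frame $E = \{E_1(t), \cdots, E_D(t) : 0 \le
t \le 1\}$ and an $\mathbb{R}^D$-valued process $\{e_i\}_{i=1}^T$ such that, with $t_i = i/T$,
• $e_i = G_E(t_i, \mathcal{F}_i)$ for some unknown measurable filter function $G_E$, where $\mathcal{F}_i = (\cdots, \varepsilon_0, \cdots, \varepsilon_{i-1}, \varepsilon_i)$
and $\{\varepsilon_i\}_{i \in \mathbb{Z}}$ are i.i.d. random variables,
• $\mathrm{Log}_{\mu(t_i)} X_i = e_i^\top E(t_i)$ with $\mu(t_i) = \arg\min_{p \in \mathcal{M}} \mathbb{E}\, d_{\mathcal{M}}^2(X_i, p)$.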
Local stationarity in manifold time series describes a data generating mechanism that varies
continuously over time, where in a short time interval, the statistical characteristics for the
time series, such as the intrinsic mean of the time series, do not significantly change.
In addition, $\mathrm{Log}_{\mu(t_i)} X_i = e_i^\top E(t_i)$ implies $X_i = \mathrm{Exp}_{\mu(t_i)}\{e_i^\top E(t_i)\}$, ensuring that the observations
$X_1, X_2, \ldots, X_T$ sampled from the data generating mechanism fall onto the manifold $\mathcal{M}$. In
contrast, analyses of the manifold time series while ignoring the manifold structure (e.g., via
embedding the manifold into a Euclidean space and performing the analyses therein) may not
preserve this important property.
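The generating mechanism $X_i = \mathrm{Exp}_{\mu(t_i)}\{e_i^\top E(t_i)\}$ can be sketched on the unit sphere $S^2$ as follows. The mean curve is taken to be an arc of the equator (a geodesic), for which a parallel orthonormal frame is available in closed form; the coordinates $e_i$ are independent Gaussians with a smoothly varying scale. All of these choices, and the function name, are assumptions made for illustration. In particular, $\mu(t_i)$ being the intrinsic mean of $X_i$ relies on the symmetry and concentration of the noise.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_map(p, v):
    theta = math.sqrt(dot(v, v))
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * a + math.sin(theta) * b / theta
            for a, b in zip(p, v)]

def simulate_sphere_series(T, speed=1.0, seed=0):
    """Draw X_i = Exp_{mu(t_i)}(e_i^T E(t_i)) on S^2 with mu(t) on the equator."""
    rng = random.Random(seed)
    xs, es = [], []
    for i in range(1, T + 1):
        t = i / T
        ang = speed * t
        mu = [math.cos(ang), math.sin(ang), 0.0]    # mean curve mu(t)
        e1 = [-math.sin(ang), math.cos(ang), 0.0]   # unit velocity of the geodesic
        e2 = [0.0, 0.0, 1.0]                        # normal to the equatorial plane
        # (e1, e2) is a parallel orthonormal frame along the equator.
        scale = 0.1 + 0.1 * t                       # smoothly varying noise level
        e = (scale * rng.gauss(0.0, 1.0), scale * rng.gauss(0.0, 1.0))
        v = [e[0] * a + e[1] * b for a, b in zip(e1, e2)]
        xs.append(exp_map(mu, v))
        es.append(e)
    return xs, es

xs, es = simulate_sphere_series(T=300)
```

By construction every simulated point lies exactly on the sphere. With `speed=0.0` and a constant `scale`, the same sketch would produce a series that is stationary in the sense introduced in this section.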
Remark 1 Throughout this manuscript, our definition of local stationarity follows the framework
of Zhou & Wu (2009). We also recognize an alternative definition for Euclidean and functional
time series discussed in Dahlhaus (1997), van Delft & Eichler (2018), which differs from that of
Zhou & Wu (2009) by a factor of Op(1/T) under certain regularity conditions. Our theoretical
results can be extended to accommodate this alternative with minimal adjustments.
To introduce the concept of second-order stationarity, we note that for a locally stationary
manifold time series $\{X_i\}_{i=1}^T$ as in the above definition, we have $\mathbb{E}[e_i] = 0$. For a fixed orthonormal
frame $E$, let $C_{ij} = \mathbb{E}(e_i e_j^\top)$ be the covariance matrix of the coordinate representations of $\mathrm{Log}_{\mu(i/T)} X_i$
and $\mathrm{Log}_{\mu(j/T)} X_j$ under $E$.
Definition 3 (second-order stationarity) A locally stationary manifold time series $\{X_i\}_{i=1}^T$ on
$\mathcal{M}$ with mean function $\mu(t)$ is second-order stationary if $C_{ij}$ depends on $i, j$ only through $|i - j|$.
If a manifold time series is both first- and second-order stationary, then we say it is stationary. Our
definition of stationarity extends the traditional notion from Euclidean space to general Riemannian
manifolds. When M is the Euclidean space endowed with the canonical inner product, then our
definition of both first- and second-order stationarity is identical to the classical definition as given
in Section 2.1. The definition is also invariant to the choice of the parallel orthonormal frame,
i.e., if $E$ and $E'$ are two parallel orthonormal frames along $\mu(t)$ and $\{X_i\}_{i=1}^T$ is first-order and/or
second-order stationary under $E$, then it is also first-order and/or second-order stationary under $E'$.
In fact, the concept of second-order stationarity in Euclidean space also (implicitly) depends on
parallel orthonormal frames; see Remark 2 for elaboration.
The above three definitions provide tools to characterize dynamic states of different levels
for manifold time series. For example, for the aforementioned cell developmental data, first-
order stationarity of cell-type composition time series indicates that cell-type transitions reach
an equilibrium state, while second-order stationarity suggests that the randomness in cell-type
transitions, caused by noise in sampling procedures or cellular birth and death, remains constant
over time. In addition, the proposed local stationarity can serve as a valuable tool for modeling
multi-resolution and continuous cellular developmental processes, particularly observed in tissue
generation (Lähnemann et al. 2020).
Remark 2 One may notice that second-order stationarity on a Riemannian manifold
is defined through the parallel orthonormal frame, while in Euclidean space, the definition
of stationarity appears to be free of orthonormal frames. However, even in
Euclidean space, second-order stationarity is implicitly defined on a parallel orthonormal
frame, and second-order stationarity may not hold if the basis along the mean is no longer a
parallel orthonormal frame. For example, let $\{(Z_{i,1}, Z_{i,2})\}_{i=1}^T$ be an i.i.d. sequence of standard
Gaussian random vectors in $\mathbb{R}^2$, and $X_i = (i/T + 0.5 Z_{i,1} + 0.5 Z_{i,2},\ i/T + 0.3 Z_{i,1} + 2 Z_{i,2})$.
Then $\{X_i\}_{i=1}^T$ is a second-order stationary time series with a linear trend. Let $E_1 = (1, 0)$ and
$E_2 = (0, 1)$ be the canonical orthonormal basis of $\mathbb{R}^2$, and let $E_1(t) = \cos(t)E_1 + \sin(t)E_2$ and
$E_2(t) = -\sin(t)E_1 + \cos(t)E_2$ form a time-varying orthonormal basis for $0 \le t \le 1$, which
is no longer parallel. The coordinate representation of the detrended time series $\{X_i - (i/T, i/T)\}_{i=1}^T$
under the frame $\{E_1(i/T), E_2(i/T)\}$ is not stationary because the autocovariance matrix of the
coordinate representation $\{e_i\}_{i=1}^T$ depends on $i$.
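The example in Remark 2 can be checked directly, because the covariance of the coordinates under the rotating frame is available in closed form: if $\Sigma$ is the covariance of the detrended vector, the coordinates under $\{E_1(t), E_2(t)\}$ have covariance with entries $E_r(t)^\top \Sigma E_c(t)$. The short sketch below evaluates this at two time points; the helper name `coord_cov` is hypothetical.

```python
import math

# Detrended X_i equals A Z_i with Z_i standard Gaussian in R^2 (from Remark 2).
A = [[0.5, 0.5], [0.3, 2.0]]
SIGMA = [[sum(A[r][k] * A[c][k] for k in range(2)) for c in range(2)]
         for r in range(2)]              # Sigma = A A^T

def coord_cov(t):
    """Covariance of the coordinates of the detrended X_i under {E_1(t), E_2(t)}."""
    frame = [[math.cos(t), math.sin(t)],     # E_1(t)
             [-math.sin(t), math.cos(t)]]    # E_2(t)
    return [[sum(frame[r][a] * SIGMA[a][b] * frame[c][b]
                 for a in range(2) for b in range(2))
             for c in range(2)] for r in range(2)]

cov_start = coord_cov(0.0)   # coordinates under the canonical basis
cov_end = coord_cov(1.0)     # coordinates under the rotated basis at t = 1
```

The variance of the first coordinate is 0.5 at $t = 0$ but roughly 4.09 at $t = 1$, so the lag-0 covariance of the coordinates changes with $i$, while the trace (total variance) is unchanged by the rotation.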
4 Tests of Stationarity
The real-world examples in the introduction highlight the considerable scientific importance of
assessing the stationarity in manifold time series. In this section, we introduce detailed statistical
testing procedures for both first- and second-order stationarity in manifold time series.
4.1 First-order stationarity test
Let $\mathcal{M}$ be a Riemannian manifold of dimension $D$, and $\{X_i\}_{i=1}^T$ be a locally stationary time
series with mean function $\mu(t)$ satisfying Definition 2. Let $E = \{E_1(t), \cdots, E_D(t) : 0 \le t \le 1\}$
be a fixed parallel orthonormal frame along $\mu(t)$ and $\{e_i\}_{i=1}^T$ be the coordinate representation of
$\mathrm{Log}_{\mu(i/T)} X_i \in T_{\mu(i/T)}\mathcal{M}$ under the basis $\{E_1(i/T), \cdots, E_D(i/T)\}$, for $i = 1, \cdots, T$. We consider the
following null and alternative hypotheses:
$H_0 : \mu(t) \equiv \mu$ for some constant $\mu \in \mathcal{M}$, versus $H_1 : \mu(t)$ is a non-constant smooth curve.
We employ a CUSUM statistic to construct a test for these hypotheses. First, we estimate $\mu$ by
the empirical intrinsic mean $\hat\mu = \arg\min_{p \in \mathcal{M}} T^{-1} \sum_{i=1}^T d_{\mathcal{M}}^2(p, X_i)$. Then, with $v_i = \mathrm{Log}_{\hat\mu} X_i$ and
$S_j = \sum_{i=1}^j v_i$, we introduce the test statistic
$$Q_T = \max_{1 \le j \le T} \|T^{-1/2} S_j\|_{\hat\mu}.$$
Under $H_1$, one would expect the CUSUM statistic $Q_T$ to be larger than when
$H_0$ is valid.
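To make the construction of $Q_T$ concrete, the following Python sketch computes it for data on the unit sphere $S^2$. The restriction to the sphere, the fixed-point iteration used for $\hat\mu$, and all function names are assumptions made for illustration. On the sphere with the inherited metric, $\|\cdot\|_{\hat\mu}$ is the ordinary Euclidean norm of the tangent vector in $\mathbb{R}^3$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exp_map(p, v):
    theta = norm(v)
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * a + math.sin(theta) * b / theta
            for a, b in zip(p, v)]

def log_map(p, x):
    c = max(-1.0, min(1.0, dot(p, x)))
    w = [b - c * a for a, b in zip(p, x)]
    nw = norm(w)
    if nw < 1e-12:
        return [0.0] * len(p)
    return [math.acos(c) * wk / nw for wk in w]

def frechet_mean(xs, iters=200, tol=1e-12):
    p = list(xs[0])
    for _ in range(iters):
        logs = [log_map(p, x) for x in xs]
        g = [sum(col) / len(xs) for col in zip(*logs)]
        p = exp_map(p, g)
        if norm(g) < tol:
            break
    return p

def cusum_statistic(xs):
    """Q_T = max_j ||T^{-1/2} S_j||, with S_j the partial sums of Log_{mu_hat} X_i."""
    T = len(xs)
    mu_hat = frechet_mean(xs)
    s = [0.0] * len(mu_hat)
    q = 0.0
    for x in xs:
        v = log_map(mu_hat, x)               # residual v_i in the tangent space
        s = [a + b for a, b in zip(s, v)]    # running CUSUM S_j
        q = max(q, norm(s) / math.sqrt(T))
    return q

pole = [0.0, 0.0, 1.0]
tangents = ([0.3, 0.0, 0.0], [-0.3, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, -0.3, 0.0])
q_T = cusum_statistic([exp_map(pole, v) for v in tangents])
```

In this toy example the sample intrinsic mean is the pole, the partial sums alternate between a vector of length 0.3 and zero, and hence $Q_T = 0.3/\sqrt{4} = 0.15$.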
To develop a test based on the CUSUM statistic QT, we study the asymptotic property of QT,
starting with introducing some technical definitions and regularity conditions. As the manifold
time series may contain complex dependency structures, we first introduce an additional quantity
to quantify the temporal dependency; similar dependency measures can also be found in Wu
(2005) and Zhou (2013).
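Definition 4 Let $\{X_i\}_{i=1}^T$ be a locally stationary time series as in Definition 2, and $\{\varepsilon_i'\}_{i \in \mathbb{Z}}$ an i.i.d.
copy of $\{\varepsilon_i\}_{i \in \mathbb{Z}}$. Assume that $\max_{1 \le i \le T} \mathbb{E}\|e_i\|_p^p < \infty$ for some positive $p$, where $\|\cdot\|_p$ is the $L^p$-norm
in Euclidean space. Then for any integer $k > 0$, the $k$-th physical dependence measure is
$$\delta_p(k, G_E) := \sup_{0 \le t \le 1} \left( \mathbb{E}\|G_E(t, \mathcal{F}_k) - G_E(t, (\mathcal{F}_{-1}, \varepsilon_0', \varepsilon_1, \cdots, \varepsilon_k))\|_p^p \right)^{1/p}. \tag{2}$$
If $k \le 0$, we take $\delta_p(k, G_E) := 0$ by convention.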
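As a worked illustration of Definition 4, consider a scalar ($D = 1$) time-varying linear filter. The specific filter and the bound $\bar a$ below are assumptions chosen for illustration and do not appear in the paper. Replacing $\varepsilon_0$ by its independent copy changes only the term with lag $k$, so the dependence measure can be computed exactly.

```latex
% Illustrative filter (an assumption, not from the paper):
%   G_E(t, \mathcal{F}_k) = \sum_{j \ge 0} a(t)^j \varepsilon_{k-j},
%   with \sup_{0 \le t \le 1} |a(t)| \le \bar a < 1.
\begin{align*}
G_E(t, \mathcal{F}_k) - G_E\bigl(t, (\mathcal{F}_{-1}, \varepsilon_0', \varepsilon_1, \cdots, \varepsilon_k)\bigr)
  &= a(t)^k (\varepsilon_0 - \varepsilon_0'), \\
\delta_p(k, G_E)
  &= \sup_{0 \le t \le 1} |a(t)|^k \bigl(\mathbb{E}|\varepsilon_0 - \varepsilon_0'|^p\bigr)^{1/p}
  \le \bar a^{\,k} \bigl(\mathbb{E}|\varepsilon_0 - \varepsilon_0'|^p\bigr)^{1/p}.
\end{align*}
```

In this example the dependence measure decays geometrically in $k$, provided $\varepsilon_0$ has a finite $p$-th moment.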
We also assume the following regularity conditions for establishing the asymptotic distributions of
the test statistic.
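(A1) The Hessian tensor $H(p, X)$ is $L_H$-Lipschitz continuous in $p$ given $X$, and $L_H$-Lipschitz
continuous in $X$ almost surely for any fixed $p$, where $L_H < \infty$ is uniformly bounded.
(A2) There exists some finite constant $C$ such that $\mathbb{E}\|G_E(t, \mathcal{F}_0) - G_E(s, \mathcal{F}_0)\|_2 \le C|s - t|$, and
$\mathbb{E}\, d_{\mathcal{M}}\bigl(\mathrm{Exp}_{\mu(t)}\{G_E(t, \mathcal{F}_0)^\top E(t)\}, \mathrm{Exp}_{\mu(s)}\{G_E(s, \mathcal{F}_0)^\top E(s)\}\bigr) \le C|t - s|$,
$\forall s, t \in [0,1]$.
(A3) $\delta_4(k, G_E) = O(\alpha^k)$ for some $\alpha \in [0,1)$, where $\delta_4(k, G_E)$ is defined in Definition 4.
(A4) Let $\Sigma_E(t) = \sum_{k \in \mathbb{Z}} \mathbb{E}\{G_E(t, \mathcal{F}_0) G_E(t, \mathcal{F}_k)^\top\}$ for $t \in [0,1]$, where $\mathbb{Z}$ is the set of all integers.
We assume the smallest eigenvalue of $\Sigma_E(t)$ is bounded away from 0 uniformly over
$t \in [0,1]$.
(A5) $\sup_{0 \le t \le 1} P(\|G_E(t, \mathcal{F}_0)\|_2 \ge M) \le \exp(-C_1 M)$ for some constant $C_1 < \infty$ and any $M > 0$, i.e.,
$G_E(t, \mathcal{F}_0)$ is uniformly sub-exponential.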
The above assumptions, whose Euclidean counterparts are common in the literature, are further
discussed in Remark 3. A concrete example satisfying the above conditions is provided in Remark
4. The following lemma plays an important role in the investigation of the asymptotic properties
of $Q_T$; recall that $S_\mu\mathcal{M}$, the space of self-adjoint operators on $T_\mu\mathcal{M}$, is defined in Section 2.2.
Remark 3 Assumption (A1) holds when the support of the data is a bounded subset of $\mathcal{M}$, and
can be replaced with sub-Gaussian conditions, for example, $\max_{1 \le i \le T} P(d_{\mathcal{M}}(X_i, \mu) > M) \le
\exp(-CM^2)$ for some positive constant $C < \infty$ and any $M > 0$. Euclidean counterparts of
Assumptions (A2)-(A4) are common in the literature on stationarity tests, such as Zhou (2013).
Condition (A5) is required to control the variation induced by the curved nature of manifolds.
Stronger conditions were used in previous works on non-Euclidean data analysis. For example,
Petersen & Müller (2019) and Dubey & Müller (2020) assumed bounded support of the data.
Remark 4 We give an example satisfying Assumptions (A1)-(A5). Let $\mathcal{M} = \mathrm{Sym}_3^+$ be the space of $3 \times 3$
SPD matrices with the affine-invariant metric (Moakher 2005). Let $\mu(t)$ be a geodesic such
that $\mu(0) = I_3$ and $\mu(1) = 1.5 I_3$. Let $\{E_{j,k}(0)\}_{1 \le j \le k \le 3} \subset \mathrm{Sym}_3$ be a set of $3 \times 3$ symmetric
matrices with 1 at the $(j,k)$ and $(k,j)$ entries and 0 at the remaining entries. One can show
that $\{E_{j,k}(0)\}_{1 \le j \le k \le 3}$ is an orthogonal basis of $T_{\mu(0)}\mathrm{Sym}_3^+$, with $\|E_{j,k}(0)\|_{\mu(0)} = 1$ for $j = k$, and
$\|E_{j,k}(0)\|_{\mu(0)} = \sqrt{2}$ for $j \neq k$. Let $\{E_{j,k}(t) : 1 \le j \le k \le 3,\ 0 \le t \le 1\}$ be the parallel orthogonal
frame along $\mu(t)$ with initial value $E_{j,k}(0)$. For notational simplicity, we also let $t_i = i/T$. A
time-varying auto-regressive process satisfying our conditions is given as follows:
$$\mathrm{Log}_{\mu(t_i)} X_{i+1} = (0.05 + 0.25 t_i)\, P_{\mu(t_i)}^{\mu(t_i)} \mathrm{Log}_{\mu(t_{i+1})} X_i + \{(t_i - 0.5)^2 + 0.2\}\varepsilon_i,$$
where $\varepsilon_i = \sum_{1 \le j \le k \le 3} Z_{i,j,k} E_{j,k}(t_i)$, and the $Z_{i,j,k}$ are independent Gaussian random
variables such that $Z_{i,j,k} \sim N(0,1)$ if $j = k$ and $Z_{i,j,k} \sim N(0,1/4)$ if $j \neq k$. Here, $P_{\mu(s)}^{\mu(t)}$ is the parallel
transport map from $\mu(s)$ to $\mu(t)$ along $\mu$.
Lemma 1 Let $H_i = H(\mu, X_i)$. If Assumptions (A1)-(A5) hold and $\mu(t) \equiv \mu$ for some constant
$\mu \in \mathcal{M}$, then $d_{\mathcal{M}}(\hat\mu, \mu) = O_p(T^{-1/2})$. In addition, there exists a unique $S_\mu\mathcal{M}$-valued function
$H(t) : [0,1] \to S_\mu\mathcal{M}$ such that $\sup_{1 \le k \le T} \|H(k/T) - T^{-1} \sum_{i=1}^k H_i\|_\mu = O_p(T^{-1/2})$.
We are ready to present the asymptotic null distribution of QT in the following theorem.
Theorem 1 If Assumptions (A1)-(A5) hold and $\mu(t) \equiv \mu$ for some constant $\mu \in \mathcal{M}$, then
$$Q_T \xrightarrow{D} \sup_{0 \le t \le 1} \|U(t) - H(t) \circ H^{-1}(1) \circ U(1)\|_\mu, \tag{3}$$
where $H$ is introduced in Lemma 1 and $U(t) = u(t)^\top E(0)$ with $u(t)$ being a centered Gaussian
process with covariance function $\Sigma_u(t, s) = \int_0^{\min(t,s)} \Sigma_E(\xi)\, d\xi$.
Theorem 1 states that the null distribution of the test statistic QT converges to the distribution of
the sup-norm of a centered Gaussian process defined on TµM. In Euclidean space and Hilbert
space, the operator-valued function $H(t)$ is given by $H(t) = t \circ \mathrm{Id}$. In this case, we have
$H(t) \circ H^{-1}(1) = t \circ \mathrm{Id}$ and $Q_T$ converges weakly to $\sup_{0 \le t \le 1} \|U(t) - tU(1)\|_2$, which coincides with
the limiting distribution of $T^{-1/2} \max_{1 \le k \le T} \|\sum_{1 \le j \le k} (X_j - T^{-1} \sum_{1 \le l \le T} X_l)\|$
as given in Zhou (2013). However, for a first-order stationary time series in a general Riemannian
manifold with non-vanishing curvature, $H(t) \circ H^{-1}(1) \neq t \circ \mathrm{Id}$, and the test proposed by Zhou
(2013) is no longer valid since it does not include the additional term $H(t)$ induced by the
curvature. Intuitively, the difference between $H(t) \circ H^{-1}(1)$ and $t \circ \mathrm{Id}$ is induced by the deviation
shown in Figure 1, i.e., the deviation of $P_{\hat\mu}^{\mu} \mathrm{Log}_{\hat\mu} X_i$ from $\mathrm{Log}_\mu X_i - \mathrm{Log}_\mu \hat\mu$.
Figure 1: Left panel: Illustration of how a curved manifold differs from Euclidean space and
affects the CUSUM statistic. Assume $\mu$ is the population intrinsic mean, $\hat\mu$ is the sample intrinsic
mean, and $X_i$ is a data point in $\mathcal{M}$. Let $v_i = \mathrm{Log}_\mu X_i$ and $\hat v_i = \mathrm{Log}_{\hat\mu} X_i$. The red star ★ represents
$P_{\hat\mu}^{\mu} \hat v_i$, and the square ∎ represents $v_i - \mathrm{Log}_\mu \hat\mu$. In Euclidean space, $P_{\hat\mu}^{\mu} \hat v_i = X_i - \hat\mu$, $v_i = X_i - \mu$,
$\mathrm{Log}_\mu \hat\mu = \hat\mu - \mu$, and thus $P_{\hat\mu}^{\mu} \hat v_i = v_i - \mathrm{Log}_\mu \hat\mu$, or equivalently $X_i - \hat\mu = (X_i - \mu) - (\hat\mu - \mu)$. However,
in a curved manifold, as shown in the figure, $P_{\hat\mu}^{\mu} \hat v_i$ (★) deviates from $v_i - \mathrm{Log}_\mu \hat\mu$ (∎); this deviation,
which is unknown and needs to be estimated from data, contributes to the CUSUM statistic. Right
panel: Illustration of the local alternative. We consider a perturbation $\tau(T) \cdot b(t)$ on the tangent
space $T_\mu\mathcal{M}$. Let $\gamma(s, t) = \mathrm{Exp}_\mu(s \cdot b(t))$. As $T \to \infty$, $\tau(T)$ converges to 0, and the mean function
$\mu_T(t) = \gamma(\tau(T), t)$ converges to $\mu$.
The limiting process established by Theorem 1 includes two components, specifically, a
Gaussian random process $U(t)$ with a complicated covariance function and a deterministic
operator-valued function $H(t)$. To perform a valid test under the null hypothesis, we propose to
approximate the deterministic function $H(t)$ by a CUSUM statistic and bootstrap the random
process $U(t)$ by adapting the Gaussian multiplier bootstrap in Zhou (2013). Specifically, for
$t = k/T$ with some positive integer $k$, we take $\hat H(t) = T^{-1} \sum_{i=1}^k H(\hat\mu, X_i)$ as an estimate of $H(t)$.
Roughly speaking, $\hat H(\cdot)$ can be viewed as a plug-in estimate of $H(\cdot)$ by substituting $\mu$ with $\hat\mu$.
We bootstrap the Gaussian process U(t) by a moving-block multiplier bootstrap procedure, as
follows. Let $n$ be a fixed block size. For each bootstrap sample, generate i.i.d. standard Gaussian
random variables $\{R_k\}_{k=n}^{T-n+1}$. For $t = k/T$ with $k \in \{1, \ldots, T\}$, define
$U^\star(t) = \sum_{j=1}^k \{n(T - n + 1)\}^{-1/2} R_j \sum_{i=j}^{j+n-1} \mathrm{Log}_{\hat\mu} X_i$.
Via resampling from $U^\star$, we can obtain an estimate of the null
distribution of QT; see Algorithm 1, where a test procedure is provided in Step 5. The following
theorem establishes the consistency of the Gaussian multiplier bootstrap method with curvature
term adjustment under the null, showing that the proposed test procedure is asymptotically valid.

Theorem 2 Suppose that the assumptions in Theorem 1 hold and the block size n ∶= n(T) satisfies
$\lim_{T\to\infty} n(T) = \infty$ and $\lim_{T\to\infty} T^{-1} n(T) = 0$. Under H0, conditioning on $\{X_i\}_{i=1}^{T}$, we then have
$$Q_T^{(b)} \xrightarrow{D} \sup_{0 \le t \le 1} \left\| U(t) - H(t) \circ H^{-1}(1) \circ U(1) \right\|_{\mu}. \tag{4}$$
Algorithm 1 Curvature Adjusted Multiplier Bootstrap (CAMB)
Input: Manifold time series $\{X_i\}_{i=1}^{T}$, bootstrap sample size B, and the significance level α.
1. Estimate the empirical intrinsic mean $\hat{\mu} = \arg\min_{p \in M} T^{-1} \sum_{i} d_M^2(p, X_i)$. Estimate the
Riemannian Hessian tensor $\hat{H}_i$ by the plug-in estimator $\hat{H}_i = H(\hat{\mu}, X_i)$, and the tensor-valued
process $\hat{H}_j = T^{-1} \sum_{i=1}^{j} \hat{H}_i$, j = 1,⋯,T.
2. Compute the residuals $v_i = \mathrm{Log}_{\hat{\mu}} X_i$, and determine the moving-block size n by the
minimum-volatility method (Politis et al. 2012).
3. Compute the CUSUM $S_j = \sum_{i=1}^{j} v_i$ for 1 ≤ j ≤ T, the test statistic $Q_T$, and the moving-block
local sum $S_{j,n} = \sum_{i=j}^{j+n-1} v_i$ for 1 ≤ j ≤ T − n + 1.
4. Generate bootstrap samples of $Q_T$:
for b = 1,⋯,B do
i. Generate T − n + 1 i.i.d. standard normal random variables $\{R_j^{(b)}\}_{j=1}^{T-n+1}$.
ii. $V_{k,n}^{(b)} = \sum_{j=1}^{k} \{n(T-n+1)\}^{-1/2} S_{j,n} R_j^{(b)}$, for k = n,⋯,T − n + 1.
iii. $Q_T^{(b)} = \max_{n \le k \le T-n+1} \| V_{k,n}^{(b)} - \hat{H}_k \circ \hat{H}_T^{-1} \circ V_{T-n+1,n}^{(b)} \|_{\hat{\mu}}$.
end for
5. Obtain the bootstrap p-value $= B^{-1} \sum_{b=1}^{B} I\{Q_T^{(b)} \ge Q_T\}$, and reject H0 if p-value ≤ α.
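The steps of Algorithm 1 can be sketched in the flat case. The following is a minimal illustration, not the authors' implementation, and it rests on assumptions that are not in the text above: the data live in R^d, so that Log_µ(x) = x − µ and the intrinsic mean is the sample average; the curvature term reduces to Ĥ_k ∘ Ĥ_T^{-1} = (k/T)·Id; and the statistic is taken to be Q_T = T^{-1/2} max_k ‖S_k‖. The function name `camb_euclidean` is hypothetical.

```python
import numpy as np

def camb_euclidean(X, n, B=500, seed=0):
    """Moving-block multiplier bootstrap for a mean-constancy CUSUM test in R^d.

    Assumptions (illustrative): Log_mu(x) = x - mu, the Hessian term is
    (k/T) * Id, and Q_T = T^{-1/2} * max_k ||S_k||.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    T, d = X.shape
    v = X - X.mean(axis=0)                       # Step 2: residuals v_i
    S = np.cumsum(v, axis=0)                     # Step 3: CUSUM S_j
    Q = np.linalg.norm(S, axis=1).max() / np.sqrt(T)
    c = np.vstack([np.zeros((1, d)), S])
    S_block = c[n:] - c[:-n]                     # S_{j,n}, j = 1..T-n+1
    ks = np.arange(n, T - n + 2)                 # k = n..T-n+1 (1-based)
    Qb = np.empty(B)
    for b in range(B):                           # Step 4
        R = rng.standard_normal(T - n + 1)
        V = np.cumsum(S_block * R[:, None], axis=0) / np.sqrt(n * (T - n + 1))
        diff = V[ks - 1] - (ks[:, None] / T) * V[-1]
        Qb[b] = np.linalg.norm(diff, axis=1).max()
    return Q, float(np.mean(Qb >= Q))            # Step 5: bootstrap p-value

rng = np.random.default_rng(1)
X_null = rng.standard_normal((200, 3))           # constant mean
X_alt = X_null.copy()
X_alt[100:] += 2.0                               # mean shift halfway through
print(camb_euclidean(X_null, n=5))
print(camb_euclidean(X_alt, n=5))
```

On a genuinely curved manifold the factor k/T would be replaced by the estimated operator Ĥ_k ∘ Ĥ_T^{-1}, which is exactly the adjustment that distinguishes CAMB from a plain multiplier bootstrap.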
Next we study the asymptotic local power of the proposed test, where we utilize tools of
parametrized surfaces in manifolds (Do Carmo 1992). Let µ ∈M be a constant, and b(t) ∶[0,1] →
TµM be a smooth curve, γ(s,t) = Expµ{s ⋅b(t)}, 0 ≤s,t ≤1 be a parametrized surface near µ,
and {Ej(s,t),j = 1,⋯,d, 0 ≤s,t ≤1} a collection of vector fields such that
• For fixed s, {Ej(s,t), 0 ≤t ≤1 ,j = 1,⋯,d} is a parallel orthonormal frame along γ;
• Ej(s,t) ∶[0,1] × [0,1] →T M is smooth on [0,1] × [0,1] for all j = 1,⋯,d.
We consider the following local alternative hypothesis under the locally stationary scheme:
$$\mu_T(t) = \gamma(\tau(T), t), \quad \text{with } \tau(T) \text{ being a non-negative sequence s.t. } \lim_{T\to\infty} \tau(T) = 0, \tag{5}$$
for a locally stationary time series $\{X_i\}_{i=1}^{T}$ as in Definition 2 with
$$\mathrm{Log}_{\mu_T(i/T)} X_i = e_i^{\top} E(\tau(T), i/T). \tag{6}$$
A visual illustration of this local alternative is provided in the right panel of Figure 1. This local
alternative scheme possesses two properties. First, for each T, the data are locally stationary with
respect to the mean curve $\mu_T(\cdot)$ and the parallel orthonormal frame E(τ,⋅). Second, as T → ∞, the time series
changes smoothly and converges uniformly to a first-order stationary time series at rate τ(T). In
Euclidean space, this local alternative scheme is identical to the case where $X_i = \mu_T(i/T) + e_i$
with $\mu_T(t) = \mu + \tau(T) b(t)$ for a smooth function b(t), and $\{e_i\}_{i=1}^{T}$ is a zero-mean locally stationary
time series. The following theorems present the asymptotic results for the local alternative.
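Theorem 3 Assume (A1)-(A5) and the local alternative hypothesis given by Eq.(5) and Eq.(6).
1. If $\lim_{T\to\infty} T^{1/2} \tau(T) = \infty$, then $Q_T \to \infty$ almost surely.
2. If $\tau(T) = T^{-1/2}$, then, with H(t) defined in Lemma 1, we have
$$Q_T \xrightarrow{D} \sup_{0 \le t \le 1} \left\| U(t) - H(t) \circ H^{-1}(1) \circ U(1) + H(t) \circ H^{-1}(1) \circ \int_0^1 \frac{\partial}{\partial \xi} H(\xi) \circ b(\xi) \, d\xi - \int_0^t \frac{\partial}{\partial \xi} H(\xi) \circ b(\xi) \, d\xi \right\|_{\mu}. \tag{7}$$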
Theorem 4 Under the conditions of Theorem 3, if we further assume that $\lim_{T\to\infty} n(T) = \infty$ and
$\lim_{T\to\infty} n(T)^{1/2} \tau(T) = 0$, then the bootstrap procedure in Algorithm 1 is consistent in the sense
that, conditioning on $\{X_i\}_{i=1}^{T}$, $Q_T^{(b)} \xrightarrow{D} \sup_{0 \le t \le 1} \| U(t) - H(t) \circ H^{-1}(1) \circ U(1) \|_{\mu}$.
Theorem 4 suggests that, even under the local alternative, the bootstrap samples $Q_T^{(b)}$ are
asymptotically drawn from the limiting null distribution, provided that the block size n meets the
condition $n(T)^{1/2} \tau(T) \to 0$, which is stronger than the conditions in Theorem 2. Theorems 3 and 4
together show that our method can detect first-order non-stationarity at rate $T^{-1/2}$ and has
asymptotic power 1 whenever $\lim_{T\to\infty} T^{1/2} \tau(T) = \infty$, $\lim_{T\to\infty} n(T)^{1/2} \tau(T) = 0$ and $\lim_{T\to\infty} n(T) = \infty$. Note that, in
Theorem 3, if b(⋅) ≡ 0, i.e., under the null hypothesis, the asymptotic distribution of $Q_T$ given by
Eq.(7) is identical to the one in Theorem 2.
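The local alternative in Eq.(5) can be made concrete on the unit sphere, where the exponential map has a closed form. The sketch below is illustrative only: the base point `mu`, the perturbation direction `b` and the helper names are hypothetical choices, and the unit 2-sphere stands in for a general manifold M. It evaluates µ_T(t) = Exp_µ{τ(T)·b(t)} with τ(T) = T^{-1/2} and prints the largest geodesic distance from µ over a grid of t, which shrinks as T grows.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: Exp_p(v) for v tangent at p."""
    r = np.linalg.norm(v)
    if r < 1e-12:
        return p.copy()
    return np.cos(r) * p + np.sin(r) * v / r

def local_alternative_mean(mu, b, tau, t):
    """mu_T(t) = gamma(tau, t) = Exp_mu(tau * b(t))."""
    return sphere_exp(mu, tau * b(t))

mu = np.array([0.0, 0.0, 1.0])

def b(t):
    # Hypothetical smooth curve in the tangent space at mu (third entry is 0).
    return np.array([np.sin(2 * np.pi * t), t, 0.0])

for T in (100, 10_000):
    tau = T ** -0.5
    dists = [np.arccos(np.clip(mu @ local_alternative_mean(mu, b, tau, t), -1, 1))
             for t in np.linspace(0.0, 1.0, 11)]
    print(T, max(dists))
```

Since the geodesic distance between µ and Exp_µ(v) equals ‖v‖ for short tangent vectors, the printed values are τ(T)·max_t ‖b(t)‖ over the grid, so the mean curve collapses to µ at rate τ(T).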
Remark 5 Our test for first-order stationarity differs from previous change point detection
methods (Dubey & Müller 2020, Wang et al. 2023, Jiang et al. 2024) by examining whether
the mean is constant or varies (continuously or discontinuously) over time, allowing gradual
changes. In contrast, their methods detect abrupt changes and assume the time series can be
segmented into blocks of constant mean and variance, a condition not required in our test.
4.2
Second-order stationarity test
If a manifold time series is first-order stationary, it is natural to further test the second-order
stationarity. Below we propose a second-order stationarity test for first-order stationary manifold
time series using local spectral density (Dahlhaus 1997, Dette et al. 2011, van Delft et al. 2021).
Let $\{X_i\}_{i=1}^{T}$ be a first-order stationary manifold time series with constant intrinsic mean µ. The
local spectral density of the coordinate representation of $\{\mathrm{Log}_{\mu} X_i\}_{i=1}^{T}$ under a given orthonormal
frame E, i.e., the time series $\{e_i\}_{i=1}^{T}$, is
$$F_E(\omega, t) = (2\pi)^{-1} \sum_{h \in \mathbb{Z}} \mathbb{E}\{ G_E(t, \mathcal{F}_0) G_E^{\top}(t, \mathcal{F}_h) \} e^{-\mathrm{i}\omega h}, \quad \omega \in [-\pi, \pi],$$
where we define $\mathrm{i} = \sqrt{-1}$ throughout this paper. Under some technical assumptions
introduced later, the local spectral density is well-defined. The second-order stationarity of
$\{X_i\}_{i=1}^{T}$ is equivalent to $F_E(\omega, t) \equiv F_E(\omega)$ a.e. on [−π,π] × [0,1], for some function $F_E(\omega)$.
Thus, testing the second-order stationarity is equivalent to testing the following hypothesis:
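$$H_0: F_E(\omega, t) \equiv F_E(\omega) \text{ a.e. on } [-\pi, \pi] \times [0, 1]; \qquad H_1: F_E(\omega, t) \ne F_E(\omega) \text{ for all } F_E(\omega) \text{ on a subset of } [-\pi, \pi] \times [0, 1] \text{ with positive Lebesgue measure.} \tag{8}$$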
We then define the squared variation of $F_E(\omega, t)$ by
$$V_F^2 = \int_{-\pi}^{\pi} \int_0^1 \| F_E(\omega, u) \|_{HS}^2 \, du \, d\omega - \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^2 \, d\omega,$$
where $\bar{F}_E(\omega) = \int_0^1 F_E(\omega, t) \, dt$ and $\| \cdot \|_{HS}$ is the Hilbert-Schmidt norm of complex matrices. Since
the Hilbert-Schmidt norm is invariant under unitary transformations, $V_F^2$ is independent of the choice of E.
Note that $V_F^2 = 0$ if and only if $F_E(\omega, t) \equiv \bar{F}_E(\omega)$ a.e. on [−π,π] × [0,1]. Thus, testing the
hypothesis in (8) is equivalent to testing
$$H_0: V_F^2 = 0 \quad \text{versus} \quad H_1: V_F^2 > 0. \tag{9}$$
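To see how the squared variation separates the two hypotheses in (9), it can be evaluated numerically for a simple scalar example. The sketch below is an illustration under assumptions that are not part of the paper: it uses the local spectral density of a one-dimensional time-varying AR(1) process with unit innovation variance, f(ω,t) = (2π)^{-1} |1 − a(t) e^{-iω}|^{-2}, and a midpoint rule for the integrals. The function names are hypothetical.

```python
import numpy as np

def squared_variation(f, n_omega=512, n_t=256):
    """Midpoint-rule approximation of V_F^2 for a scalar density f(omega, t)."""
    d_omega = 2 * np.pi / n_omega
    omega = -np.pi + (np.arange(n_omega) + 0.5) * d_omega
    t = (np.arange(n_t) + 0.5) / n_t
    F = f(omega[:, None], t[None, :])                 # shape (n_omega, n_t)
    term1 = np.sum(np.mean(np.abs(F) ** 2, axis=1)) * d_omega   # double integral of |F|^2
    term2 = np.sum(np.abs(np.mean(F, axis=1)) ** 2) * d_omega   # integral of |F_bar|^2
    return term1 - term2

def tvar1_density(a):
    """Local spectral density of a scalar tvAR(1) with coefficient function a(t)."""
    return lambda omega, t: 1.0 / (2 * np.pi * np.abs(1 - a(t) * np.exp(-1j * omega)) ** 2)

v_stationary = squared_variation(tvar1_density(lambda t: 0.1 + 0.0 * t))
v_varying = squared_variation(tvar1_density(lambda t: 0.1 + 0.5 * t))
print(v_stationary, v_varying)
```

With a constant coefficient the two integrals coincide and the result is zero up to rounding; once a(t) varies, Jensen's inequality makes the first integral strictly larger, which is the quantity the test targets.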
We adapt the technique from van Delft et al. (2021), originally developed for assessing second-order
stationarity in functional time series, to our context of manifold time series. Let m and n be two
positive integers such that mn = T and n is even. The idea is to split the time series into m
blocks of size n, and then estimate the local spectral density and the squared variation $V_F^2$. Let ˆµ
be the empirical intrinsic mean and $I_n(\omega, t) = J_n(\omega, t) \otimes J_n(\omega, t)$, where ⊗ is the complex tensor
product (i.e., conjugation included) and
$$J_n(\omega, t) = (2\pi n)^{-1/2} \sum_{h=0}^{n-1} \mathrm{Log}_{\hat{\mu}} X_{\lfloor tT \rfloor - n/2 + 1 + h} \cdot e^{-\mathrm{i} h \omega}.$$
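Then the coordinate representation of $I_n(\omega, t)$ under any orthonormal frame at ˆµ can serve as an
estimator of $F_E(\omega, t)$, and the test statistic is given by
$$V_{\hat{F}}^2 = 4\pi T^{-1} \sum_{k=1}^{n/2} \sum_{j=1}^{m} \langle I_n(\omega_k, t_j), I_n(\omega_{k-1}, t_j) \rangle_{HS} + \hat{W} - 4\pi n^{-1} \sum_{k=1}^{n/2} \Big\langle m^{-1} \sum_{j=1}^{m} I_n(\omega_k, t_j), \; m^{-1} \sum_{j=1}^{m} I_n(\omega_k, t_j) \Big\rangle_{HS},$$
where $\omega_k = 2k\pi/n$, $t_j = n(j - 0.5)/T$, and $\hat{W} = T^{-1} \sum_{k=1}^{n/2} \sum_{j=1}^{m} \| J_n(\omega_k, t_j) \|_{\hat{\mu}}^2 \| J_n(\omega_{k-1}, t_j) \|_{\hat{\mu}}^2$.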
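The building blocks of the statistic, the block-wise discrete Fourier transforms J_n(ω_k, t_j) and the Hilbert-Schmidt inner products of adjacent periodograms, can be sketched as follows. This is a minimal illustration under stated assumptions: the rows of the (T, d) array `V` stand in for the coordinates of Log_µ̂ X_i in a fixed orthonormal frame, and the helper names are hypothetical. Because I_n = J_n ⊗ J_n has rank one, the inner product of two periodograms reduces to the squared modulus of the inner product of the two DFT vectors. The sketch deliberately stops short of assembling the full statistic.

```python
import numpy as np

def block_dft(V, n):
    """J[j, k] = (2*pi*n)^(-1/2) * sum_h V[j*n + h] * exp(-1j*h*w_k),
    with w_k = 2*pi*k/n for k = 0..n/2 and blocks j = 0..m-1 (0-based)."""
    T, d = V.shape
    m = T // n
    blocks = V[: m * n].reshape(m, n, d)
    return np.fft.fft(blocks, axis=1)[:, : n // 2 + 1, :] / np.sqrt(2 * np.pi * n)

def hs_inner_adjacent(J):
    """<I_n(w_k, t_j), I_n(w_{k-1}, t_j)>_HS for k = 1..n/2, using I = J J^*."""
    inner = np.sum(J[:, 1:, :] * np.conj(J[:, :-1, :]), axis=2)
    return np.abs(inner) ** 2                     # shape (m, n/2)

rng = np.random.default_rng(0)
V = rng.standard_normal((64, 3))                  # stand-in tangent coordinates
J = block_dft(V, n=8)
print(J.shape, hs_inner_adjacent(J).shape)
```

With n even and t_j = n(j − 0.5)/T, the index ⌊t_j T⌋ − n/2 + 1 + h runs over the j-th block of n consecutive observations, which is why a plain reshape into blocks followed by an FFT along the block axis reproduces J_n.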
To develop a test based on the statistic $V_{\hat{F}}^2$, we proceed to study its asymptotic distribution.
Let $Y_i(t) = (Y_{i,1}(t), \cdots, Y_{i,d+d(d+1)/2}(t))$ be a vector such that $(Y_{i,1}(t), \cdots, Y_{i,d}(t)) = G_E(t, \mathcal{F}_i)$ and
$$Y_{i,\,d+(d-j/2)(j-1)+k}(t) = \langle E_j(t), H(\mu, \mathrm{Exp}_{\mu}\{ G_E(t, \mathcal{F}_i)^{\top} E(t) \}) \circ E_{j+k}(t) \rangle_{\mu}, \quad 1 \le j \le d, \; j \le k \le d.$$
We denote the kth-order joint cumulant of random variables $Z_1, \cdots, Z_k$ by $\mathrm{cum}_k(Z_1, \cdots, Z_k)$.
We assume that $\{X_i\}_{i=1}^{T}$ satisfies the following conditions. For
every even number k ∈ ℕ, there exists a positive sequence $\alpha_{k; i_1, \cdots, i_{k-1}}$ such that, for all j = 0,⋯,k−1
and for some ℓ ∈ ℕ, we have $\sum_{i_1, \cdots, i_{k-1}} (1 + |i_j|^{\ell}) \alpha_{k; i_1, \cdots, i_{k-1}} < \infty$, and
(C1) $\sup_{0 \le t \le 1} | \mathrm{cum}_k\{ Y_{i_1, l_1}(t), \cdots, Y_{i_k, l_k}(t) \} | \le \alpha_{k; i_1 - i_k, \cdots, i_{k-1} - i_k}$, for all $(l_1, \cdots, l_k) \in \{1, \cdots, d + d(d+1)/2\}^k$,
(C2) $\sup_{0 \le t \le 1} | \frac{\partial}{\partial t^{\ell}} \mathrm{cum}_k\{ Y_{i_1, l_1}(t), \cdots, Y_{i_k, l_k}(t) \} | \le \alpha_{k; i_1 - i_k, \cdots, i_{k-1} - i_k}$, for all $(l_1, \cdots, l_k) \in \{1, \cdots, d + d(d+1)/2\}^k$.
The conditions (C1) and (C2) are imposed to guarantee the existence of the local spectral density
and the weak convergence of the test statistic. Similar conditions are used in previous work (van
Delft et al. 2021).

Theorem 5 If $\{X_i\}_{i=1}^{T}$ is a first-order stationary time series, the conditions (A1)-(A5) and (C1)-(C2)
hold, and $T^{1/2} \ll n \ll T^{2/3}$, then under both null and fixed alternative hypotheses, we
have
$$T^{1/2} (V_{\hat{F}}^2 - V_F^2) \xrightarrow{D} N(0, \sigma_V^2), \quad \text{as } T \to \infty,$$
where $0 < \sigma_V^2 < \infty$.
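Theorem 6 Suppose conditions (A1)-(A5) and (C1)-(C2) hold. Then under H0 of (9), we have
$\sigma_V^2 = 4\pi \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^4 \, d\omega$. In addition, the estimator
$$\hat{\sigma}_V^2 = 16\pi^2 n^{-1} \sum_{k=1}^{n/2} \Big( m^{-1} \sum_{j=1}^{m} \langle I_n(\omega_{k-1}, t_j), I_n(\omega_k, t_j) \rangle_{HS} \Big)^2 \tag{10}$$
converges to $4\pi \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^4 \, d\omega$ in probability under both H0 and H1.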
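The above theorem implies that $\hat{\sigma}_V^2$ consistently estimates $\sigma_V^2$ under H0. It also suggests
that, for a significance level α, we can conduct the test by rejecting the null hypothesis H0 if
$T^{1/2} V_{\hat{F}}^2 / \hat{\sigma}_V \ge z_{1-\alpha}$, where $z_{1-\alpha}$ is the 1 − α quantile of the standard normal distribution. Under H1,
the quantity $\hat{\sigma}_V^2$ given by (10) also converges to $4\pi \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^4 \, d\omega$ in probability. Thus, when
$0 < 4\pi \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^4 \, d\omega < \infty$, the probability of rejecting the null hypothesis H0 is approximately
$\Phi\{ T^{1/2} V_F^2 / \sigma_V - (4\pi \int_{-\pi}^{\pi} \| \bar{F}_E(\omega) \|_{HS}^4 \, d\omega)^{1/2} z_{1-\alpha} / \sigma_V \}$, where Φ(⋅) is the CDF of the standard normal
distribution; the sign of the second term follows from Theorem 5, since rejection occurs when
$T^{1/2} (V_{\hat{F}}^2 - V_F^2) / \sigma_V \ge \hat{\sigma}_V z_{1-\alpha} / \sigma_V - T^{1/2} V_F^2 / \sigma_V$. This result implies that the test has
asymptotic power 1 as T → ∞ under any fixed alternative H1.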
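The rejection rule above amounts to a one-sided z-test. The following sketch shows it with the Python standard library only; the function name and the returned one-sided p-value are illustrative additions, and the inputs `v2_hat` and `sigma_hat` are assumed to have been computed beforehand from the test statistic and from the estimator in (10).

```python
from statistics import NormalDist

def second_order_decision(v2_hat, sigma_hat, T, alpha=0.05):
    """Reject H0 when sqrt(T) * v2_hat / sigma_hat >= z_{1-alpha}.

    Returns the standardized statistic, a one-sided p-value, and the decision.
    """
    z = (T ** 0.5) * v2_hat / sigma_hat
    p_value = 1.0 - NormalDist().cdf(z)
    reject = z >= NormalDist().inv_cdf(1.0 - alpha)
    return z, p_value, reject

print(second_order_decision(0.00, 1.0, T=100))   # statistic at zero: no rejection
print(second_order_decision(0.50, 1.0, T=100))   # large statistic: rejection
```

Only large positive values count as evidence against H0 because the squared variation is non-negative by construction, so the alternative lies on one side.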
Interestingly, Theorem 5 implies that, under certain regularity conditions, the null distribution of
our test statistic for second-order stationarity is not affected by the curvature. This is in contrast
to the first-order stationarity test, where curvature does have an impact on the null distribution. The
reason is that the curvature effect in $4\pi T^{-1} \sum_{k=1}^{n/2} \sum_{j=1}^{m} \langle I_n(\omega_k, t_j), I_n(\omega_{k-1}, t_j) \rangle_{HS}$ is asymptotically
neutralized by the curvature effect in $4\pi n^{-1} \sum_{k=1}^{n/2} \langle m^{-1} \sum_{j=1}^{m} I_n(\omega_k, t_j), m^{-1} \sum_{j=1}^{m} I_n(\omega_k, t_j) \rangle_{HS}$
under the null hypothesis. Under the alternative hypothesis, the curvature effect exists and
asymptotically takes the form $\langle U, \mathrm{Log}_{\mu} \hat{\mu} \rangle_{\mu}$, where U is an unknown deterministic vector in $T_{\mu} M$
that vanishes when M is a Euclidean space. Thus, the asymptotic alternative distribution of
$T^{1/2} (V_{\hat{F}}^2 - V_F^2)$ is a Gaussian distribution, with a variance different from the one in Euclidean
or Hilbert space. The asymptotic variance under the alternative is complex, and therefore not
included here; details on the asymptotic behavior related to the curvature impact on the second-order
stationarity test can be found in Section S1.6 of the supplementary materials.
Remark 6 Our second-order stationarity test addresses a different setting from the variance
change detection in non-Euclidean data proposed by Dubey & Müller (2020) and Jiang et al.
(2024). Their focus is on detecting changes in $\mathbb{E} d_M^2(\mu, X_i)$, which, in our context, corresponds
to testing whether the trace of the time-varying matrix $F_E(0, t)$ remains independent of t. In
contrast, our test examines the entire variance structure, not just its trace, and also accommodates
continuously varying variance over time.
Remark 7 Although the null distribution of the test statistic for second-order stationarity on
the manifold is the same as in Hilbert space or Euclidean space, the block size n is more restricted.
For the test in Hilbert space, the upper bound of n is of order $T^{3/4}$ (van Delft et al. 2021), while
for a general manifold it is of order $T^{2/3}$, since the curvature effect introduces a bias of order
$T^{-1} n^{3/2}$ in non-trivial manifolds, as discussed in the supplementary materials.
Remark 8 The constant mean assumption is common in the literature on second-order stationarity
tests (Dette et al. 2011, Preuß et al. 2013, van Delft et al. 2021), and when µ is non-constant but
smooth, it can be estimated (van Delft et al. 2021). For instance, we can estimate µ using methods
from Petersen & Müller (2019), Lin & Müller (2021), and then estimate $J_n(\omega, t)$ by parallel
transporting $\mathrm{Log}_{\hat{\mu}(i/T)} X_i$ to $\hat{\mu}(1/T)$ along the estimated curve $\hat{\mu}(\cdot)$ as a detrending procedure.
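The detrending step described in Remark 8 can be sketched on the unit sphere, where the logarithm map and parallel transport along minimizing geodesics have closed forms. This is an illustration under assumptions that are not in the paper: the estimated mean curve is supplied as an array `mu_hat` of points on the sphere, and transport along the curve is approximated by composing geodesic transports between consecutive points. All function names are hypothetical.

```python
import numpy as np

def sphere_log(p, x):
    """Log_p(x) on the unit sphere (x must not be antipodal to p)."""
    c = np.clip(p @ x, -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    return theta / np.sin(theta) * (x - c * p)

def sphere_transport(p, q, v):
    """Parallel transport of v from T_p to T_q along the minimizing geodesic (q != -p)."""
    return v - (q @ v) / (1.0 + p @ q) * (p + q)

def detrend(X, mu_hat):
    """Carry each Log_{mu_hat[i]} X[i] back to the tangent space at mu_hat[0]."""
    out = []
    for i in range(len(X)):
        v = sphere_log(mu_hat[i], X[i])
        for k in range(i, 0, -1):                 # step mu_hat[k] -> mu_hat[k-1]
            v = sphere_transport(mu_hat[k], mu_hat[k - 1], v)
        out.append(v)
    return np.array(out)

rng = np.random.default_rng(0)
ang = np.linspace(0.0, 0.5, 10)
mu_hat = np.stack([np.sin(ang), np.zeros(10), np.cos(ang)], axis=1)   # toy mean curve
X = mu_hat + 0.1 * rng.standard_normal((10, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
W = detrend(X, mu_hat)
print(np.abs(W @ mu_hat[0]).max())               # detrended vectors are tangent at mu_hat[0]
```

Parallel transport is an isometry, so each detrended vector keeps the length d(µ̂(i/T), X_i) while all vectors end up in a single tangent space, where the block-wise Fourier transforms can be computed.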

5
Simulations
In this section, we conduct Monte Carlo simulation experiments to study the finite sample
performance of our proposed testing procedures in two cases: (i) the hypersphere, an example of a
positively curved manifold; (ii) the space of SPD matrices endowed with a negatively curved manifold structure.
5.1
Simulations for first-order stationarity test on spherical time series
In the numerical study of the first-order stationarity test, we consider the following two settings, and
report Type-I error rates under the null and power under the alternative hypothesis.
Setting (i): We simulate locally stationary time series on $S^6 = \{x \in \mathbb{R}^7 : \|x\|_2^2 = 1\}$, as follows.
Let $t_i = i/T$ for 1 ≤ i ≤ T. Take µ(t) to be the geodesic such that µ(0) = (0,0,0,0,0,0,1) and
µ(1) = (1,0,0,0,0,0,0), and let $\mu_{\tau}(t) = \mu(\tau t)$ be a re-scaled version of µ(t) for τ ∈ [0,1]. We also
denote by $P_{\mu_{\tau}(s)}^{\mu_{\tau}(t)}(\cdot)$ the parallel transport map from $\mu_{\tau}(s)$ to $\mu_{\tau}(t)$ along the geodesic $\mu_{\tau}(\cdot)$, which
is equivalent to the parallel transport map from µ(τs) to µ(τt) along the geodesic µ(⋅) in this
simulation setting. For j = 1,...,6, let $E_j(0)$ be the vector with 1 in the jth entry and 0 in the other
entries; we view $\{E_j(0)\}_{j=1}^{6}$ as an orthonormal basis of $T_{\mu(0)} S^6$. Then we consider the following
time-varying auto-regressive model
$$M_1(\tau): \mathrm{Log}_{\mu_{\tau}(t_{i+1})} X_{i+1} = \{0.05 + 0.5 t_i (1 - t_i)\} P_{\mu_{\tau}(t_i)}^{\mu_{\tau}(t_{i+1})} \mathrm{Log}_{\mu_{\tau}(t_i)} X_i + (1 + \tau)^{-1} \varepsilon_i, \tag{11}$$
where $\varepsilon_i = \sum_{j=1}^{6} \sigma_j(\tau, t_i) Z_{i,j} E_j(\tau t_i)$, $\sigma_j(\tau, t_i) = (1.1 + 1.1 t_i)/(1 + \tau)$ if j = 1,2,3 and $\sigma_j(\tau, t_i) = 1/(1 + \tau)$
if j = 4,5,6, $Z_{i,j} \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(-0.5, 0.5)$, and $\{E_j(t), 1 \le j \le 6, 0 \le t \le 1\}$ is a parallel
orthonormal frame along µ(t), i.e., $E_j(t) = P_{\mu(0)}^{\mu(t)} E_j(0)$. The parameter τ in Eq.(11) determines
the deviation of the time series from first-order stationarity. When τ = 0, the time series is
first-order stationary. As τ increases, the model will deviate from the null and we use it to
evaluate the performance of the test statistic under the alternative. In this setting, we consider
τ = 0.125, 0.25, 0.375, 0.75,1.0.
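A simulation along the lines of Setting (i) can be sketched by working in coordinates of the parallel frame, since parallel transport leaves those coordinates unchanged. The code below is a hedged illustration rather than the authors' simulation code. Its assumptions: µ(t) is the minimizing, constant-speed geodesic from (0,…,0,1) to (1,0,…,0); the noise coordinates are read in the frame at µ_τ(t_{i+1}), so that both sides of the recursion live in the same tangent space; and the first observation sits at the mean (zero initial coordinates). The function name is hypothetical.

```python
import numpy as np

def simulate_m1(T, tau, seed=0):
    """Sketch of model M1(tau) on S^6 in parallel-frame coordinates."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1) / T
    coords = np.zeros((T, 6))                    # coordinates of Log_{mu_tau(t_i)} X_i
    for i in range(T - 1):
        a = 0.05 + 0.5 * t[i] * (1 - t[i])
        sigma = np.where(np.arange(6) < 3, 1.1 + 1.1 * t[i], 1.0) / (1 + tau)
        z = rng.uniform(-0.5, 0.5, size=6)
        coords[i + 1] = a * coords[i] + sigma * z / (1 + tau)
    X = np.empty((T, 7))
    for i in range(T):
        s = 0.5 * np.pi * tau * t[i]             # arc length of mu(tau * t_i)
        mu = np.zeros(7)
        mu[0], mu[6] = np.sin(s), np.cos(s)
        E = np.eye(7)[:6].copy()                 # rows E_1..E_6 at mu(0)
        E[0] = [np.cos(s), 0, 0, 0, 0, 0, -np.sin(s)]   # E_1 transported along mu
        v = coords[i] @ E                        # tangent vector at mu_tau(t_i)
        r = np.linalg.norm(v)
        X[i] = mu if r < 1e-12 else np.cos(r) * mu + np.sin(r) * v / r
    return X

X = simulate_m1(T=100, tau=0.5)
print(X.shape, np.abs(np.linalg.norm(X, axis=1) - 1).max())
```

Only E_1 changes along the curve because the geodesic moves in the plane spanned by the first and last coordinate axes; the remaining frame vectors are orthogonal to that plane and stay fixed under parallel transport.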
Setting (ii): We next consider $\mathrm{Sym}_3^+$, the space of 3 × 3 SPD matrices endowed with the
affine-invariant metric (Moakher 2005), which is a six-dimensional negatively-curved Riemannian
manifold. Let µ(t) be a geodesic joining $I_3$ and $2I_3$ such that $\mu(0) = I_3$ and $\mu(1) = 2I_3$, and
define $\mu_{\tau}(t) = \mu(\tau t)$. Let $\{E_{j,k}(0)\}_{1 \le j \le k \le 3} \subset \mathrm{Sym}_3$ be the set of 3 × 3 symmetric matrices with 1
at the (j,k) and (k,j) entries and 0 at the remaining entries. Note that $\{E_{j,k}(0)\}_{1 \le j \le k \le 3}$ form an
orthogonal basis of $T_{\mu(0)} \mathrm{Sym}_3^+$, with $\|E_{j,k}(0)\|_{\mu(0)} = 1$ for j = k, and $\|E_{j,k}(0)\|_{\mu(0)} = \sqrt{2}$ for j ≠ k.
Let $\{E_{j,k}(t) : 1 \le j \le k \le 3, 0 \le t \le 1\}$ be the parallel orthogonal frame along µ(t) with initial
value $E_{j,k}(0)$. We simulate the following time-varying auto-regressive process:
$$M_2(\tau): \mathrm{Log}_{\mu_{\tau}(t_{i+1})} X_{i+1} = (0.05 + 0.25 t_i) P_{\mu_{\tau}(t_i)}^{\mu_{\tau}(t_{i+1})} \mathrm{Log}_{\mu_{\tau}(t_i)} X_i + (1 + 2\tau)^{-1} \{6.25 (t_i - 0.25)^2 + 0.2\} \varepsilon_i,$$
where $\varepsilon_i = \sum_{1 \le j \le k \le 3} Z_{i,j,k} E_{j,k}(\tau t_i)$, and the $Z_{i,j,k}$ are independent Gaussian random
variables such that $Z_{i,j,k} \sim N(0, 1)$ if j = k and $Z_{i,j,k} \sim N(0, 1/4)$ if j ≠ k. As before, $P_{\mu_{\tau}(s)}^{\mu_{\tau}(t)}(\cdot)$ is
the parallel transport map from $\mu_{\tau}(s)$ to $\mu_{\tau}(t)$ along the geodesic $\mu_{\tau}(\cdot)$. When τ = 0, the
manifold time series is first-order stationary. For the power study under the alternative, we consider
τ = 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75.
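The geometric ingredients of Setting (ii) can be checked numerically. The sketch below implements the exponential map, logarithm map and tangent norm of the affine-invariant metric through eigendecompositions; it uses the standard closed forms Exp_P(V) = P^{1/2} exp(P^{-1/2} V P^{-1/2}) P^{1/2} and ‖V‖_P = ‖P^{-1/2} V P^{-1/2}‖_F, and the helper names are hypothetical. It is meant only as an illustration of these maps, not as the simulation code of the paper.

```python
import numpy as np

def _sym_fun(S, f):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(S)
    return (U * f(w)) @ U.T

def _roots(P):
    return _sym_fun(P, np.sqrt), _sym_fun(P, lambda w: 1.0 / np.sqrt(w))

def spd_exp(P, V):
    """Exp_P(V) under the affine-invariant metric."""
    R, Ri = _roots(P)
    return R @ _sym_fun(Ri @ V @ Ri, np.exp) @ R

def spd_log(P, X):
    """Log_P(X) under the affine-invariant metric."""
    R, Ri = _roots(P)
    return R @ _sym_fun(Ri @ X @ Ri, np.log) @ R

def spd_norm(P, V):
    """Tangent norm ||V||_P = ||P^{-1/2} V P^{-1/2}||_F."""
    _, Ri = _roots(P)
    return np.linalg.norm(Ri @ V @ Ri)

I3 = np.eye(3)
mu = lambda t: spd_exp(I3, t * spd_log(I3, 2 * I3))   # geodesic from I_3 to 2 I_3
E11 = np.zeros((3, 3)); E11[0, 0] = 1.0
E12 = np.zeros((3, 3)); E12[0, 1] = E12[1, 0] = 1.0
print(mu(0.5)[0, 0], spd_norm(I3, E11), spd_norm(I3, E12))
```

The geodesic joining I_3 and 2I_3 is µ(t) = 2^t I_3, so its midpoint has diagonal entries √2, and the two printed norms reproduce the values 1 and √2 quoted for the diagonal and off-diagonal basis elements.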
For the first-order stationarity test in both scenarios, we consider T = 50, 100, 500. The number
of Monte Carlo runs is 5000. The null distribution of the test statistic $Q_T$ is estimated by the
curvature adjusted multiplier bootstrap (CAMB) method in Algorithm 1. We compare the proposed
test against two approaches: (B1) a method that bootstraps $\sup_t \| U(t) - t U(1) \|_{\mu}$, neglecting the
curvature effect H(t), and (B2) an approach that treats the manifold time series as a Euclidean
multivariate time series and uses the multiplier bootstrap technique suggested by Zhou (2013),
thereby overlooking the manifold structure. The bootstrap sample size is set as B = 2000 for all

Figure 2: Simulated power curves for the first-order stationarity test. Left Panel: power curve
for the first-order stationarity test of spherical time series. Right Panel: power curve for the first-order
stationarity test of SPD-matrix-valued time series. The significance level is 0.05.
methods in this benchmark study.
The Type-I error rates of our method and the two comparison methods are reported in Table
1. In the sphere scenario, we find that the Type-I error rates of the two comparison methods
are inflated, while our method controls the Type-I error well. In the context of SPD matrices,
the first comparison method, B1, which overlooks the curvature, tends to be overly conservative
in this instance, exhibiting an empirical rejection probability of approximately 0.009. This
conservative approach may result in diminished power under the alternative hypothesis. Results of
the comparison method B1 in both scenarios numerically support that the curvature term H(t)
plays an important role in manifold time series. On the other hand, the second method, B2, which
treats the data as Euclidean multivariate time series, leads to an escalation in the eigenvalues of the
sample mean and variance. Consequently, the variance associated with the multiplier bootstrap
also surges, rendering the Type-I error rates for this approach unreliable in this scenario.
We also evaluate the power of the first-order stationarity test by varying τ. For each fixed τ, the
power is calculated based on 5000 Monte Carlo runs. We plot the power curves in
Figure 2. As expected, the test becomes more powerful as T increases,
and the power ultimately reaches 1 as τ continues to increase.

5.2
Simulations for second-order stationarity test
To study the second-order stationarity test, for both the $S^6$ and $\mathrm{Sym}_3^+$ settings, we simulate locally
stationary manifold time series which are first-order stationary from the model
$$M_3(\tau): \mathrm{Log}_{\mu} X_{i+1} = [0.1 + \tau \{0.2 \cos(2\pi t_i) + t_i (1 - t_i)\}] \cdot \mathrm{Log}_{\mu} X_i + \varepsilon_i.$$
In the 6-dimensional hypersphere $S^6$ case, we set µ = (0,0,0,0,0,0,1) and take $\{E_j\}_{j=1}^{6}$ to be an
orthonormal basis of $T_{\mu} M$, as the $E_j(0)$ in Setting (i) of Section 5.1. We set $\varepsilon_i = \sum_{j=1}^{6} Z_{i,j} E_j$, with
$Z_{i,j} \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(-0.75, 0.75)$. For the $\mathrm{Sym}_3^+$-valued time series, µ is set to be $I_3$, and $\varepsilon_i$ is generated
in the same way as in Section 5.1. When τ = 0, the manifold time series under both settings is
second-order stationary, and when τ > 0, the simulated time series is non-stationary in terms of
the second order. We consider τ = 0.25, 0.5, 0.75, 1.0, 1.5 in the power study. We implement
5000 Monte Carlo replications with T = 256, 512, 1024, respectively. The block size is set to be
n = 8 as suggested in van Delft et al. (2021).
Type-I error rates are reported in Table 2. We observe that, under both the $S^6$ and $\mathrm{Sym}_3^+$ settings,
at the significance level α = 0.05, the Type-I error rates decrease as T increases, but are slightly
inflated for relatively small T. This slight inflation is due to an intrinsic limitation of the method
we adapted (van Delft et al. 2021), which also applies to Euclidean time series. Specifically, we
show in Table 2 that for an AR(0.1) process in $\mathbb{R}^6$, the Type-I error rates of this testing procedure
are also slightly larger than 0.05. In our power study, we observe that the testing power for both the
$S^6$ and $\mathrm{Sym}_3^+$ settings increases to 1 as τ grows; see Figure 3.

Figure 3: Simulated power curves for the second-order stationarity test. Left Panel: power curve
for the second-order stationarity test of spherical time series. Right Panel: power curve for the
second-order stationarity test of SPD-matrix-valued time series. The significance level is 0.05.
6
Application to Real Data
In this section, we apply our stationarity tests to a single-cell RNA sequencing dataset generated by
Schiebinger et al. (2019). The raw data are available at NCBI Gene Expression Omnibus
(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122662). The goal is to
understand the developmental process of mouse embryonic cells and model the change of cell-type
proportion at each stage. To achieve this goal, scientists first obtained mouse embryonic cells
from a single female embryo, plated cells for 18 days, measured the gene expression profiles of
cells collected across 18 days, and finally profiled 251,203 high-quality cells with 1,479 variable
genes after pre-processing. A nonlinear dimensionality reduction method called force-directed
layout embedding (Jacomy et al. 2014) was used to visualize the temporal change of cellular
populations in 2D in the original work, as shown in Figure 4A. These cells were then assigned to
seven major cell types by clustering and annotation with gene signature scores provided by prior
biological knowledge. The annotated seven cell types are Mouse Embryonic Fibroblasts (MEFs),
Mesenchymal-Epithelial Transition (MET) Cells, Induced Pluripotent Stem (IPS) Cells, Stromal
Cells, Epithelial Cells, Neural Cells and Trophoblasts; each cell type has its own morphological
features and functions. In this study, the proportions of these cell types are observed at each time
point, with data collected at 37 time points over the course of 18 days (at 12-hour intervals).
A common approach to modeling compositional data is the square-root transformation, which
maps the data to a hypersphere. This transformation has the advantage that the composition
constraint and zero components are naturally incorporated (Stephens 1982, Scealy & Welsh 2011).
Applying the square-root transformation to our data, we obtain a time series in the hypersphere
$S^6$ with length T = 37, visualized in Figure 4B. We aim to answer a biological
question: do the cell-type proportions change systematically over time, or equivalently, has the
cell-type transition reached dynamic equilibrium? This question is closely related to the discussion
regarding the validity of adopting a dynamic equilibrium assumption for modeling cellular dynamics
without prior knowledge in cell biology (Schiebinger et al. 2019, Zhou et al. 2021, Sha et al. 2024).
Statistically, the question is equivalent to testing the constancy of the mean of this hyperspherical
time series, i.e., the first-order stationarity, and our proposed test serves as a tool to assess the
feasibility of such an assumption when applied to real data.
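The preprocessing described above can be sketched as follows: the entrywise square root sends a vector of seven proportions to a point of S^6, after which the sample intrinsic mean can be computed by the usual fixed-point iteration µ ← Exp_µ(mean_i Log_µ X_i). The data in this sketch are synthetic Dirichlet draws used purely as a stand-in for the cell-type proportions, and all function names are hypothetical.

```python
import numpy as np

def sqrt_transform(props):
    """Map compositions (non-negative rows) to the unit sphere by an entrywise square root."""
    props = np.asarray(props, dtype=float)
    return np.sqrt(props / props.sum(axis=1, keepdims=True))

def sphere_log(p, x):
    c = np.clip(p @ x, -1.0, 1.0)
    theta = np.arccos(c)
    return np.zeros_like(p) if theta < 1e-12 else theta / np.sin(theta) * (x - c * p)

def sphere_exp(p, v):
    r = np.linalg.norm(v)
    return p.copy() if r < 1e-12 else np.cos(r) * p + np.sin(r) * v / r

def intrinsic_mean(X, n_iter=100, tol=1e-10):
    """Fixed-point iteration for the sample intrinsic (Frechet) mean on the sphere."""
    mu = X.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(n_iter):
        g = np.mean([sphere_log(mu, x) for x in X], axis=0)
        mu = sphere_exp(mu, g)
        if np.linalg.norm(g) < tol:
            break
    return mu

rng = np.random.default_rng(0)
props = rng.dirichlet(np.full(7, 5.0), size=37)   # synthetic stand-in: 37 time points, 7 types
Y = sqrt_transform(props)
mu_hat = intrinsic_mean(Y)
print(np.linalg.norm(Y, axis=1).round(6)[:3], mu_hat.round(3))
```

Because the proportions sum to one, the squared entries of each transformed vector also sum to one, so every observation lies on the unit sphere, and zero proportions cause no difficulty.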
Specifically, we apply the proposed first-order stationarity test to the data, with bootstrap
sample size B = 2000 and block size selected by the minimum volatility method (Politis et al.
2012). The corresponding p-value is 0.0005, providing strong evidence to reject the null
hypothesis. Thus, the cell-type proportions in this cell population undergo systematic temporal
change, and cell-type transitions are still out of dynamic equilibrium. This result is consistent with
the findings in Schiebinger et al. (2019), who discovered that the extracted mouse embryonic
cells have a strong capacity for differentiation and gradually move to a terminal stromal state or a
MET state, where the latter further generates pluripotent, extra-embryonic, and neural cells.
We then use the same dataset as an illustrative example to evaluate the proposed second-
order stationarity test. In particular, we first estimate the mean curve ˆµ(⋅) using the total-
variation regression with regularization parameters selected by leave-one-out cross validation
(Lin & Müller 2021). Then we parallel transport $\mathrm{Log}_{\hat{\mu}(i/T)} X_i$ from $\hat{\mu}(i/T)$ to $\hat{\mu}(1/T)$ along
the curve $\hat{\mu}(\cdot)$ as a detrending procedure, and apply the second-order test to the detrended time
series $\{ P_{\hat{\mu}(i/T)}^{\hat{\mu}(1/T)} (\mathrm{Log}_{\hat{\mu}(i/T)} X_i) \}_{i=1}^{T}$. Since the sample size is small, we divide the data into 5
overlapping blocks of size n = 8, which are [1,8], [8,15], [15,22], [22,29] and [30,37]. The
p-value associated with the second-order stationarity test is 0.223. The result shows that there is
no significant evidence that the uncertainty caused by the rate of random proliferation
and apoptosis, or by noise due to technical issues in the sequencing platform, varies over time.
Constant uncertainty was implicitly assumed in the biological model of Schiebinger
et al. (2019), since the uncertainty parameter was shared by all time points in their models and
numerical analysis, and our testing result provides numerical support for this assumption in this
dataset.
Figure 4: (A): Visualization of gene expression profiles of cells using force-directed layout
embedding (a type of nonlinear dimension reduction). This figure is adopted from Schiebinger
et al. (2019), which was originally used to visualize the temporal change of cell populations. In
this visualization, each cell is depicted as a dot. The coloring of these dots corresponds to the
time point at which each cell was sequenced, with darker shades indicating later time points. This
visualization illustrates that the temporal dynamics of the cell population vary continuously
over time, and hence are locally stationary. (B): A heatmap to visualize the square-root transformed
cell-type proportion time series data of seven cell types at 37 time points across 18 days. The
square-root transform outputs a spherical time series of length T = 37 in the manifold $S^6$. Each
row corresponds to the square root of the time-varying proportion of a particular cell type within
the cellular population, while each column denotes the observed value in the manifold time series
at a particular time point. A darker hue signifies a larger proportion.

7
Discussion
In this paper, we introduce the definition of first-order and second-order stationarity of manifold-
valued time series. We propose testing methods to test both first-order and second-order stationarity.
Our methods can account for the curved nature of general manifolds. We derive the asymptotic
consistency and asymptotic local powers of the tests. Numerical simulation studies and real data
analysis are provided to illustrate the efficiency of our methods.
One limitation of our work is that the spectral density-based test for second-order stationarity
depends on the choice of block size, for which there is no universally accepted selection rule
and which requires further study. This issue is not exclusive to our approach but is a
widespread concern in second-order stationarity assessments for time series in
linear spaces (Dette et al. 2011, van Delft et al. 2021).
There are a few interesting future directions for our work. For example, in neuroscience
studies, an interesting question is how to detect structural breaks in dynamic functional connectivity
(Hutchison et al. 2013) when the state changes. This issue can be approached as a problem of
identifying breakpoints in manifold time series, which can potentially be solved by extending
our framework to detect abrupt changes in a block-wise locally stationary manifold time series.
Another interesting extension is to generalize our framework and methods to the Wasserstein space
W1([0,1]), since W1([0,1]) can be viewed as an infinite-dimensional Hilbert manifold (Chen
et al. 2023) by proper definition. However, an extension to general metric spaces is challenging
and is still an open question, and we leave it for future research.
Acknowledgment
We thank Robert J. McCann and Zhou Zhou for their insightful conversations and suggestions.

            S^6                          Sym^+_3
T       CAMB     B1       B2         CAMB     B1       B2
50      0.0364   0.0584   0.3936     0.0384   0.0082   0.0098
100     0.0544   0.1070   0.9806     0.0404   0.0086   0.0254
500     0.0392   0.1008   1.0000     0.0478   0.0098   0.1116
Table 1: Type-I error rates of the first-order stationarity test for the three benchmarked methods under the
S^6 and Sym^+_3 scenarios. CAMB represents our method, B1 represents the first comparison method
and B2 represents the second comparison method. The bootstrap size is B = 2000 for all methods
in this study. The results are based on 5000 Monte Carlo runs. The significance level
is set to be 0.05.
T       S^6      Sym^+_3   R^6
256     0.090    0.085     0.106
512     0.071    0.073     0.090
1024    0.064    0.070     0.074
Table 2: Type-I error rates of the second-order stationarity test for T = 256, 512, 1024 and n = T/8 with
values in the sphere, SPD matrices, and Euclidean space, respectively. The results are based on 5000
Monte Carlo runs. The significance level is set to be 0.05.
Supplementary Materials
The supplementary file contains technical proofs for the theorems in this article.
References
Aue, A. & van Delft, A. (2020), ‘Testing for stationarity of functional time series in the frequency
domain’, The Annals of Statistics 48(5), 2505 – 2547.
Bollerslev, T. (1986), ‘Generalized autoregressive conditional heteroskedasticity’, Journal of
Econometrics 31(3), 307–327.
Chen, Y., Lin, Z. & Müller, H.-G. (2023), ‘Wasserstein regression’, Journal of the American
Statistical Association 118(542), 869–882.
Dahlhaus, R. (1997), ‘Fitting time series models to nonstationary processes’, The Annals of
Statistics 25(1), 1–37.
Dette, H., Preuß, P. & Vetter, M. (2011), ‘A measure of stationarity in locally stationary processes
with applications to testing’, Journal of the American Statistical Association 106(495), 1113–
1124.
Do Carmo, M. P. (1992), Riemannian geometry, Vol. 6, Springer.
Dubey, P. & Müller, H.-G. (2020), ‘Fréchet change-point detection’, The Annals of Statistics
48(6), 3312–3335.
Fisher, N. I. & Lee, A. J. (1994), ‘Time series analysis of circular data’, Journal of the Royal
Statistical Society. Series B (Methodological) 56(2), 327–339.
Fréchet, M. (1948), ‘Les éléments aléatoires de nature quelconque dans un espace distancié’,
Annales de l’institut Henri Poincaré 10(4), 215–310.
Hutchison, R. M., Womelsdorf, T., Allen, E. A., Bandettini, P. A., Calhoun, V. D., Corbetta,
M., Della Penna, S., Duyn, J. H., Glover, G. H., Gonzalez-Castillo, J. et al. (2013), ‘Dynamic
functional connectivity: promise, issues, and interpretations’, NeuroImage 80, 360–378.
Jacomy, M., Venturini, T., Heymann, S. & Bastian, M. (2014), ‘Forceatlas2, a continuous graph
layout algorithm for handy network visualization designed for the gephi software’, PLOS ONE
9(6), 1–12.
Jiang, F., Zhu, C. & Shao, X. (2024), ‘Two-sample and change-point inference for non-euclidean
valued time series’, Electronic Journal of Statistics 18(1), 848–894.
Lähnemann, D., Köster, J., Szczurek, E., McCarthy, D. J., Hicks, S. C., Robinson, M. D., Vallejos,
C. A., Campbell, K. R., Beerenwinkel, N., Mahfouz, A. et al. (2020), ‘Eleven grand challenges
in single-cell data science’, Genome Biology 21(1), 1–35.
Lin, Z. & Müller, H.-G. (2021), ‘Total variation regularized Fréchet regression for metric-space
valued data’, The Annals of Statistics 49(6), 3510 – 3533.
Mardia, K. V. & Jupp, P. E. (2000), Directional statistics, Vol. 2, Wiley Online Library.
Moakher, M. (2005), ‘A differential geometric approach to the geometric mean of symmetric
positive-definite matrices’, SIAM Journal on Matrix Analysis and Applications 26(3), 735–747.
Page, E. S. (1954), ‘Continuous inspection schemes’, Biometrika 41(1/2), 100–115.
Pennec, X. (2018), ‘Barycentric subspace analysis on manifolds’, The Annals of Statistics
46(6A), 2711–2746.
Petersen, A. & Müller, H.-G. (2019), ‘Fréchet regression for random objects with Euclidean
predictors’, The Annals of Statistics 47(2), 691 – 719.
Politis, D. N., Romano, J. P. & Wolf, M. (2012), Subsampling, Springer Science & Business
Media.
Preuß, P., Vetter, M. & Dette, H. (2013), ‘A test for stationarity based on empirical processes’,
Bernoulli 19(5B), 2715 – 2749.
Priestley, M. B. (1988), ‘Non-linear and non-stationary time series analysis’, London: Academic
Press .
Scealy, J. L. & Welsh, A. H. (2011), ‘Regression for compositional data by using distributions
defined on the hypersphere’, Journal of the Royal Statistical Society. Series B: Statistical
Methodology 73(3), 351–375.
Schiebinger, G., Shu, J., Tabaka, M., Cleary, B., Subramanian, V., Solomon, A., Gould, J., Liu,
S., Lin, S., Berube, P. et al. (2019), ‘Optimal-transport analysis of single-cell gene expression
identifies developmental trajectories in reprogramming’, Cell 176(4), 928–943.
Sha, Y., Qiu, Y., Zhou, P. & Nie, Q. (2024), ‘Reconstructing growth and dynamic trajectories from
single-cell transcriptomics data’, Nature Machine Intelligence 6(1), 25–39.
Shao, L., Lin, Z. & Yao, F. (2022), ‘Intrinsic Riemannian functional data analysis for sparse
longitudinal observations’, The Annals of Statistics 50(3), 1696 – 1721.
Shumway, R. H. & Stoffer, D. S. (2000), Time series analysis and its applications,
Vol. 3, Springer.
Stephens, M. A. (1982), ‘Use of the von Mises distribution to analyse continuous proportions’,
Biometrika 69(1), 197–203.
van Delft, A. & Blumberg, A. J. (2024), ‘A statistical framework for analyzing shape in a time
series of random geometric objects’.
van Delft, A., Characiejus, V. & Dette, H. (2021), ‘A nonparametric test for stationarity in
functional time series’, Statistica Sinica 31(3), pp. 1375–1395.
van Delft, A. & Eichler, M. (2018), ‘Locally stationary functional time series’, Electronic Journal
of Statistics 12(1), 107 – 170.
Wang, X., Borsoi, R. A. & Richard, C. (2023), Online change point detection on Riemannian
manifolds with Karcher mean estimates, in ‘2023 31st European Signal Processing Conference
(EUSIPCO)’, IEEE, pp. 2033–2037.
Wied, D., Krämer, W. & Dehling, H. (2012), ‘Testing for a change in correlation at an unknown
point of time using an extended functional delta method’, Econometric Theory 28(3), 570–589.
Wu, W. B. (2005), ‘Nonlinear system theory: Another look at dependence’, Proceedings of the
National Academy of Sciences 102(40), 14150–14154.
Wu, W. B. & Zhou, Z. (2011), ‘Gaussian approximation for non-stationary multiple time series’,
Statistica Sinica 21(3), 1397–1413.
Yang, J., Gohel, S. & Vachha, B. (2020), ‘Current methods and new directions in resting state
fMRI’, Clinical Imaging 65, 47–53.
Zhou, P., Wang, S., Li, T. & Nie, Q. (2021), ‘Dissecting transition cells from single-
cell transcriptome data through multiscale stochastic dynamics’, Nature Communications
12(1), 5609.
Zhou, Z. (2013), ‘Heteroscedasticity and autocorrelation robust structural change detection’,
Journal of the American Statistical Association 108(502), 726–740.
Zhou, Z. & Wu, W. B. (2009), ‘Local linear quantile estimation for nonstationary time series’, The
Annals of Statistics pp. 2696–2729.
Zhu, C. & Müller, H.-G. (2024), ‘Spherical autoregressive models, with application to
distributional and compositional time series’, Journal of Econometrics 239(2), 105389.
Supplementary Material for “Stationarity of Manifold
Time Series”
Junhao Zhu, Dehan Kong, Zhaolei Zhang and Zhenhua Lin
S1 Technical Proofs
S1.1 Lemma 1
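Proof. We begin by showing that $\hat\mu$ is a $\sqrt{T}$-consistent estimator of $\mu$. Let $F_T(p) = \frac{1}{T}\sum_{i=1}^{T} d_{\mathcal M}^2(p, X_i)$. As $\mathcal M$ satisfies the condition (M1) or (M2), $F_T$ is strongly convex in the sense
\[
F_T(p) \ge F_T(q) + \Big\langle \frac{1}{T}\sum_{i=1}^{T}\mathrm{Log}_q X_i,\ \mathrm{Log}_q p\Big\rangle_q + \lambda\, d_{\mathcal M}^2(p, q) \tag{S1}
\]
for some constant $\lambda > 0$ depending on $\mathcal M$ only. Let $E$ be an orthonormal frame at $T_\mu\mathcal M$, and $e_i$ be the coordinate representation of $\mathrm{Log}_\mu X_i$. When conditions (A1)-(A4) hold, by Proposition 5 in Zhou (2013), on a richer probability space there exist i.i.d. standard normal random vectors $\{V_j\}_{j\in\mathbb N}$ such that
\[
\sup_{1\le i\le T}\Big|\sum_{j=1}^{i}\Big\{\frac{1}{\sqrt T}e_j - \frac{1}{\sqrt T}\Sigma_E^{1/2}\big(\tfrac{j}{T}\big)V_j\Big\}\Big| = o_p(T^{-1/2}\log^2 T),
\]
which implies that $\frac{1}{T}\sum_{j=1}^{T}\mathrm{Log}_\mu X_j = O_p(\frac{1}{\sqrt T})$. With $p = \hat\mu$ and $q = \mu$ in Eq. (S1), we have
\[
0 \ge F_T(\hat\mu) - F_T(\mu) \ge \Big\{O_p\big(\tfrac{1}{\sqrt T}\big) + d_{\mathcal M}(\mu, \hat\mu)\Big\}\cdot d_{\mathcal M}(\mu, \hat\mu),
\]
which implies $d_{\mathcal M}(\hat\mu, \mu) = O_p(\frac{1}{\sqrt T})$.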
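We next show that
\[
\sup_{1\le k\le T}\Big\| H\big(\tfrac{k}{T}\big) - \frac{1}{T}\sum_{i=1}^{k}H_i\Big\|_\mu = O_p\big(T^{-1/2}\big).
\]
Recall that $H_i = H(\mu, X_i) = H(\mu, \mathrm{Exp}_\mu(G_E(t_i, \mathcal F_i)^\top E))$, where $E = \{E_j\}_{j=1}^{d}$ is an orthonormal frame on $T_\mu\mathcal M$. By assumption (A1), there exists a constant $L_H$ such that
\[
\mathbb E\Big\| H\big(\mu, \mathrm{Exp}_\mu(G_E(s, \mathcal F_0)^\top E)\big) - H\big(\mu, \mathrm{Exp}_\mu(G_E(t, \mathcal F_0)^\top E)\big)\Big\|_{HS} \le L_H C|t - s|. \tag{S2}
\]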
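By conditions (A3) and (A5), there exists a constant $\tilde\alpha < 1$, depending only on $C_1$ and $\alpha$, such that $\langle H(\mu, \mathrm{Exp}_\mu(G_E(s, \mathcal F_0)^\top E))E_j, E_k\rangle$ also satisfies (A3) for all $1\le j\le k\le d$. Let $\{H_{i,jk}\}_{1\le j\le k\le d}$ be the coordinate representation of $H(\mu, X)$ under the basis $\{E_1,\dots,E_d\}$, i.e. $H_{i,jk} = \langle H_iE_j, E_k\rangle_\mu$. It suffices to show that
\[
A_{T,jk} := \max_{1\le l\le T}\Big|\frac{1}{T}\sum_{i=1}^{l}H_{i,jk} - \frac{1}{T}\sum_{i=1}^{l}\mathbb E H_{i,jk}\Big| = O_P\big(T^{-1/2}\big).
\]
Let $\Xi_{l,jk} = \sum_{i=1}^{l}(H_{i,jk} - \mathbb E[H_{i,jk}\mid\mathcal F_{i-1}])$ and $\Lambda_{l,jk} = \sum_{i=1}^{l}(\mathbb E[H_{i,jk}\mid\mathcal F_{i-1}] - \mathbb E[H_{i,jk}])$. Then $A_{T,jk}$ can be bounded by
\[
A_{T,jk} = \max_{1\le l\le T}\frac{1}{T}|\Xi_{l,jk} + \Lambda_{l,jk}| \le \max_{1\le l\le T}\frac{1}{T}|\Xi_{l,jk}| + \max_{1\le l\le T}\frac{1}{T}|\Lambda_{l,jk}|.
\]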
Since $\Xi_{l,jk}$ is a bounded martingale, Doob’s $L^p$ inequality (Durrett 2019) gives
\[
\mathbb E\Big[\Big(\frac{1}{T}\max_{1\le l\le T}|\Xi_{l,jk}|\Big)^2\Big] \le \frac{C}{T^2}\,\mathbb E|\Xi_{T,jk}|^2 = O\big(\tfrac{1}{T}\big). \tag{S3}
\]
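As a quick numerical illustration of the maximal inequality behind (S3), the sketch below simulates a martingale with bounded increments and compares $\mathbb E\max_l|\Xi_l|^2$ with $\mathbb E|\Xi_T|^2$. The Rademacher steps, the sample sizes and the classical $L^2$ constant 4 are assumptions of this toy example; the proof only needs some finite constant $C$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_rep = 500, 2000

# Bounded martingale differences: independent Rademacher (+1/-1) steps.
steps = rng.choice([-1.0, 1.0], size=(n_rep, T))
paths = np.cumsum(steps, axis=1)   # row r holds Xi_1, ..., Xi_T for replication r

lhs = np.mean(np.max(np.abs(paths), axis=1) ** 2)  # Monte Carlo estimate of E max_l |Xi_l|^2
rhs = np.mean(paths[:, -1] ** 2)                   # Monte Carlo estimate of E |Xi_T|^2
ratio = lhs / rhs                                  # Doob's L^2 inequality bounds this by 4
```

Dividing both sides by $T^2$ reproduces the $O(1/T)$ rate stated in (S3), since $\mathbb E|\Xi_T|^2$ grows linearly in $T$ for bounded martingale differences.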
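We now bound the quantity $\max_{1\le l\le T}\frac{1}{T}|\Lambda_{l,jk}|$. Define $\Pi_iY = \mathbb E[Y\mid\mathcal F_i] - \mathbb E[Y\mid\mathcal F_{i-1}]$. Then
\[
\max_{1\le l\le T}|\Lambda_{l,jk}| = \max_{1\le l\le T}\Big|\sum_{i=1}^{l}\sum_{a=0}^{\infty}\Pi_{i-a}H_{i,jk}\Big| = \max_{1\le l\le T}\Big|\sum_{a=0}^{\infty}\Big(\sum_{i=1}^{l}\Pi_{i-a}H_{i,jk}\Big)\Big| \le \sum_{a=0}^{\infty}\max_{1\le l\le T}\Big|\sum_{i=1}^{l}\Pi_{i-a}H_{i,jk}\Big|.
\]
The triangle inequality implies that
\[
\sqrt{\mathbb E\Big(\max_{1\le l\le T}|\Lambda_{l,jk}|\Big)^2} \le \sum_{a=0}^{\infty}\sqrt{\mathbb E\max_{1\le l\le T}\Big|\sum_{i=1}^{l}\Pi_{i-a}H_{i,jk}\Big|^2}.
\]
Note that, for each fixed $a$, $\sum_{i=1}^{l}\Pi_{i-a}H_{i,jk}$ is a martingale in $l$. By Doob’s $L^p$ inequality again, there exists a constant $C > 0$ such that
\[
\sqrt{\mathbb E\Big(\max_{1\le l\le T}|\Lambda_{l,jk}|\Big)^2} \le C\sum_{a=0}^{\infty}\sqrt{\mathbb E\Big|\sum_{i=1}^{T}\Pi_{i-a}H_{i,jk}\Big|^2}.
\]
By Theorem 1 in Wu (2005), $\mathbb E\big|\sum_{i=1}^{T}\Pi_{i-a}H_{i,jk}\big|^2 = O(T\tilde\alpha^{a})$ with $\tilde\alpha < 1$, which implies that
\[
\sqrt{\mathbb E\Big(\max_{1\le l\le T}|\Lambda_{l,jk}|\Big)^2} \le C\sqrt{T}\sum_{a=0}^{\infty}\tilde\alpha^{a} = O(\sqrt T). \tag{S4}
\]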
By the upper bounds provided by Eq. (S3) and Eq. (S4), we can deduce that
\[
A_{T,jk} \le \max_{1\le l\le T}\frac{1}{T}|\Xi_{l,jk}| + \max_{1\le l\le T}\frac{1}{T}|\Lambda_{l,jk}| = O_p\big(\tfrac{1}{\sqrt T}\big).
\]
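Let $H(t) = \int_0^t \mathbb E H(\mu, \mathrm{Exp}_\mu(G_E(s, \mathcal F_0)^\top E))\,ds$. The integrand $\mathbb E H(\mu, \mathrm{Exp}_\mu(G_E(s, \mathcal F_0)^\top E))$ is uniformly bounded and Lipschitz continuous in $s$, so the integral is well-defined. By the property of Riemann sums, we have
\[
\sup_{0\le t\le 1}\Big\| H(t) - \frac{1}{T}\sum_{i/T\le t}\mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(i/T, \mathcal F_0)^\top E)\big)\Big\|_{HS} = O\big(\tfrac{1}{T}\big).
\]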
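The $O(1/T)$ Riemann-sum rate can be checked numerically on a scalar toy integrand. In the sketch below, the function $f(s) = \sin(3s)$ stands in for the Lipschitz map $s \mapsto \mathbb E H(\mu, \mathrm{Exp}_\mu(G_E(s,\mathcal F_0)^\top E))$; this choice, and restricting the supremum to grid points $t = k/T$, are assumptions made only for illustration.

```python
import numpy as np

def f(s):
    """Toy Lipschitz integrand on [0, 1] with Lipschitz constant 3."""
    return np.sin(3.0 * s)

def integral_f(t):
    """Closed-form integral of f over [0, t]."""
    return (1.0 - np.cos(3.0 * t)) / 3.0

def sup_riemann_error(T):
    """max over k of | int_0^{k/T} f - (1/T) sum_{i<=k} f(i/T) |."""
    grid = np.arange(1, T + 1) / T
    partial_sums = np.cumsum(f(grid)) / T
    return np.max(np.abs(integral_f(grid) - partial_sums))

errors = {T: sup_riemann_error(T) for T in (100, 200, 400, 800)}
# If the error is O(1/T), then T * error stays bounded as T grows.
scaled = {T: T * err for T, err in errors.items()}
```

For a function with Lipschitz constant $L$ on $[0,1]$, the right-endpoint sum has error at most $L/(2T)$, so `scaled` should stay below $L/2 = 1.5$ here.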
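Hence, we conclude that
\[
\sup_{1\le k\le T}\Big\| H\big(\tfrac{k}{T}\big) - \frac{1}{T}\sum_{i=1}^{k}H_i\Big\|_{HS} \le \sup_{1\le k\le T}\Big\| H\big(\tfrac{k}{T}\big) - \frac{1}{T}\sum_{i=1}^{k}\mathbb E H_i\Big\|_{\mu} + \sup_{1\le k\le T}\Big\|\frac{1}{T}\sum_{i=1}^{k}H_i - \frac{1}{T}\sum_{i=1}^{k}\mathbb E H_i\Big\|_{HS} = O_P\big(T^{-1/2}\big).
\]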
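S1.2 Theorem 1
Proof. For simplicity, throughout this proof we use $\tilde e_i$ to represent $\mathrm{Log}_\mu X_i$. The proof is based on the Taylor expansion of the Riemannian log map on manifolds (Kendall & Le 2011). Let $\gamma : [0, 1]\to\mathcal M$ be the geodesic from $\hat\mu = \gamma(0)$ to $\mu = \gamma(1)$. By Taylor expansion, we have
\[
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big) - \Big\{\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j - \frac{1}{\sqrt T}\sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\}\Big\|_\mu = O_p\Big(\frac{1}{\sqrt T}\Big).
\]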
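Since parallel transport is a linear map and $S_T = 0$, the above equation implies that
\[
\sqrt{T}\,\mathrm{Log}_\mu\hat\mu = \Big(\frac{1}{T}\sum_{j=1}^{T}H(\mu, X_j)\Big)^{-1}\frac{1}{\sqrt T}\sum_{j=1}^{T}\tilde e_j + O_p\big(\tfrac{1}{T}\big).
\]
By Proposition 5 in Zhou (2013), on a richer probability space, there exists a Gaussian process $U(t)$ such that
\[
\sup_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j - U\big(\tfrac{k}{T}\big)\Big\|_\mu = o_p(T^{-1/4}\log^2 T).
\]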
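The above equation combined with Lemma 1 yields that
\[
\sup_{1\le k\le T}\Big\| P_{\hat\mu}^{\mu}\Big(\frac{1}{\sqrt T}S_k\Big) - \Big\{U\big(\tfrac{k}{T}\big) - H\big(\tfrac{k}{T}\big)\circ H^{-1}(1)\circ U(1)\Big\}\Big\|_\mu = o_p(T^{-1/4}\log^2 T),
\]
which implies that
\[
\sup_{1\le k\le T}\Big\| P_{\hat\mu}^{\mu}\Big(\frac{1}{\sqrt T}S_k\Big)\Big\|_\mu = \sup_{0\le k\le T}\Big\| U\big(\tfrac{k}{T}\big) - H\big(\tfrac{k}{T}\big)\circ H^{-1}(1)\circ U(1)\Big\|_\mu + o_p(T^{-1/4}\log^2 T).
\]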
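Since the parallel transport $P_{\gamma(0)}^{\gamma(1)}$ is an isometry from $T_{\gamma(0)}\mathcal M$ to $T_{\gamma(1)}\mathcal M$, we have
\[
\sup_{1\le k\le T}\Big\|\frac{1}{\sqrt T}S_k\Big\|_{\hat\mu} = \sup_{0\le k\le T}\Big\| U\big(\tfrac{k}{T}\big) - H\big(\tfrac{k}{T}\big)\circ H^{-1}(1)\circ U(1)\Big\|_\mu + o_p(T^{-1/4}\log^2 T),
\]
which completes the proof.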
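The isometry property of parallel transport used in the last step can be checked numerically in a simple case. The sketch below assumes the unit sphere $S^2$ and uses the standard closed-form transport along the minimizing geodesic between two distinct, non-antipodal points; the function names `sphere_log` and `parallel_transport` are hypothetical helpers for this illustration.

```python
import numpy as np

def sphere_log(p, q):
    """Riemannian log map on the unit sphere (p and q distinct, non-antipodal)."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(c)
    u = q - c * p
    return theta * u / np.linalg.norm(u)

def parallel_transport(p, q, v):
    """Transport tangent vector v at p to q along the minimizing geodesic."""
    log_pq = sphere_log(p, q)
    log_qp = sphere_log(q, p)
    d2 = np.dot(log_pq, log_pq)  # squared geodesic distance between p and q
    return v - (np.dot(log_pq, v) / d2) * (log_pq + log_qp)

p = np.array([0.0, 0.0, 1.0])
q = np.array([np.sin(0.7), 0.0, np.cos(0.7)])
v = np.array([0.3, -1.2, 0.0])   # tangent at p because <v, p> = 0
w = np.array([-0.5, 0.4, 0.0])   # another tangent vector at p

Pv = parallel_transport(p, q, v)
Pw = parallel_transport(p, q, w)
# Isometry: norms and inner products agree before and after transport,
# and the transported vectors are tangent at q.
```

Because norms are preserved, the supremum of $\|S_k/\sqrt T\|_{\hat\mu}$ can be computed in the tangent space at $\hat\mu$ without knowing $\mu$.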
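S1.3 Theorem 2
Proof. To establish the consistency of the debiased bootstrap procedure, we first show that $\hat H_k$ is a consistent estimate of $H(k/T)$. Let $\gamma : [0, 1]\to\mathcal M$ be the geodesic from $\hat\mu = \gamma(0)$ to $\mu = \gamma(1)$. Note that $H(\mu, X)$ is Lipschitz continuous w.r.t. $\mu$ by (A1), i.e.,
\[
\Big|\big\langle P_{\gamma(1)}^{\gamma(0)}(H(\mu, X)\circ E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_\mu - \big\langle H(\hat\mu, X)\circ P_{\gamma(1)}^{\gamma(0)}(E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_{\hat\mu}\Big| \le L_H\, d_{\mathcal M}(\mu, \hat\mu).
\]
The above inequality implies
\[
\Big|\Big\langle P_{\gamma(1)}^{\gamma(0)}\Big(\frac{1}{T}\sum_{j=1}^{k}H(\mu, X_j)\circ E_i\Big),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\Big\rangle_\mu - \big\langle \hat H_k\circ P_{\gamma(1)}^{\gamma(0)}(E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_{\hat\mu}\Big| \le \frac{k}{T}L_H\, d_{\mathcal M}(\mu, \hat\mu), \tag{S5}
\]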
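and
\[
\sup_{1\le k\le T}\Big|\Big\langle P_{\gamma(1)}^{\gamma(0)}\Big(\frac{1}{T}\sum_{j=1}^{k}H(\mu, X_j)\circ E_i\Big),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\Big\rangle_\mu - \big\langle \hat H_k\circ P_{\gamma(1)}^{\gamma(0)}(E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_{\hat\mu}\Big| \le L_H\, d_{\mathcal M}(\mu, \hat\mu) = O_p\big(T^{-1/2}\big).
\]
By the bound in Lemma 1 that $\sup_{1\le k\le T}\big\| H(\frac{k}{T}) - \frac{1}{T}\sum_{i=1}^{k}H_i\big\|_{HS} = O_P(\frac{\log^2 T}{T^{1/2}})$, we have
\[
\sup_{1\le k\le T}\Big|\big\langle P_{\gamma(1)}^{\gamma(0)}(H(k/T)\circ E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_\mu - \big\langle \hat H_k\circ P_{\gamma(1)}^{\gamma(0)}(E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_{\hat\mu}\Big| = O_P\big(T^{-1/2}\big). \tag{S6}
\]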
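We next show that $\{V_{k,n}\}$ consistently mimics the Gaussian process $\{U(t) : 0\le t\le 1\}$ up to an isometry, where $V_{k,n}$ is a generic version of $V^{(b)}_{k,n}$ defined in Algorithm 1. Recall that $v_i = \mathrm{Log}_{\hat\mu}X_i$ are the residuals in the tangent space at $\hat\mu$, $S_{j,n} = \sum_{i=j}^{j+n-1}v_i$ for $1\le j\le T-n+1$, and $V^{(b)}_{k,n} = \sum_{j=1}^{k}\{n(T-n+1)\}^{-1/2}S_{j,n}R_j$ for $k = n,\dots,T-n+1$, where $\{R_j\}_{j\in\mathbb N}$ are i.i.d. standard normal random variables. Define
\[
B_{i,j} = \frac{1}{n(T-n+1)}\sum_{i\le l\le j}S_{l,n}\otimes S_{l,n}
\]
and
\[
B_{i,j,a,b} = \frac{1}{n(T-n+1)}\sum_{i\le l\le j}\big\langle S_{l,n},\, P_{\gamma(1)}^{\gamma(0)}E_a\big\rangle_{\hat\mu}\big\langle S_{l,n},\, P_{\gamma(1)}^{\gamma(0)}E_b\big\rangle_{\hat\mu}. \tag{S7}
\]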
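We first show that
\[
\max_{n+1\le i\le j\le T-n+1}\Big| B_{i,j,a,b} - \int_{i/T}^{j/T}\{\Sigma_E(\xi)\}_{a,b}\,d\xi\Big| = o_p(1),
\]
where $\{\Sigma_E(\xi)\}_{a,b}$ is the $(a, b)$-entry of the matrix $\Sigma_E(\xi)$. To this end, applying the Taylor expansion of the Riemannian log map on $T_\mu\mathcal M$ yields
\[
\begin{aligned}
\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}S_{l,n} &= \frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}\sum_{k=l}^{l+n-1}v_k = \sum_{k=l}^{l+n-1}\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}v_k \\
&= \sum_{k=l}^{l+n-1}\frac{1}{\sqrt n}\tilde e_k - \frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}H(\mu, X_k)\circ\mathrm{Log}_\mu\hat\mu + o_p(1/T) \\
&= \frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k - \Big(\frac{1}{n}\sum_{k=l}^{l+n-1}H(\mu, X_k)\Big)\circ(\sqrt n\,\mathrm{Log}_\mu\hat\mu) + o_p(1/T).
\end{aligned}
\]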
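By the bounded curvature condition of the manifold, $H(\mu, X)$ is uniformly bounded, and thus,
\[
\begin{aligned}
\Big\langle\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}S_{l,n}, E_a\Big\rangle_\mu &= \Big\langle\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}\sum_{k=l}^{l+n-1}v_k, E_a\Big\rangle_\mu = \Big\langle\sum_{k=l}^{l+n-1}\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}v_k, E_a\Big\rangle_\mu \\
&= \Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_a\Big\rangle_\mu + O_p(n^{1/2}/T^{3/2}) - \Big\langle\Big(\frac{1}{n}\sum_{k=l}^{l+n-1}H(\mu, X_k)\Big)\circ(\sqrt n\,\mathrm{Log}_\mu\hat\mu), E_a\Big\rangle_\mu.
\end{aligned}
\]
We apply the above expansion of $S_{l,n}$ to $B_{i,j,a,b}$ and deduce
\[
\max_{1\le i\le j\le T-n+1}\Big| B_{i,j,a,b} - \frac{1}{n(T-n+1)}\sum_{i\le l\le j}\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_a\Big\rangle_\mu\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_b\Big\rangle_\mu\Big| = O_p\Big(\sqrt{\frac{n}{T}}\Big)
\]
for any $1\le a\le b\le d$. When the conditions (A2)-(A4) hold, by Theorem 4 in Zhou (2013),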
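we have
\[
\begin{aligned}
&\max_{n+1\le i\le j\le T-n+1}\Big| B_{i,j,a,b} - \int_{i/T}^{j/T}\{\Sigma_E(\xi)\}_{a,b}\,d\xi\Big| \\
&\quad\le \max_{1\le i\le j\le T-n+1}\Big| B_{i,j,a,b} - \frac{1}{n(T-n+1)}\sum_{i\le l\le j}\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_a\Big\rangle_\mu\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_b\Big\rangle_\mu\Big| \\
&\qquad + \max_{n+1\le i\le j\le T-n+1}\Big|\frac{1}{n(T-n+1)}\sum_{i\le l\le j}\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_a\Big\rangle_\mu\Big\langle\frac{1}{\sqrt n}\sum_{k=l}^{l+n-1}\tilde e_k, E_b\Big\rangle_\mu - \int_{i/T}^{j/T}\{\Sigma_E(\xi)\}_{a,b}\,d\xi\Big| \\
&\quad= O_p\Big(\sqrt{\frac{n}{T}} + \frac{1}{n}\Big).
\end{aligned} \tag{S8}
\]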
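Now we are ready to prove the statement of Theorem 2. Let $U_n(t) : [\frac{n}{T}, 1 - \frac{n}{T}]\to T_{\hat\mu}\mathcal M$ be the linear interpolation of $V_{k,n}$, i.e., $U_n(t) = V_{\lfloor tT\rfloor,n} + (tT - \lfloor tT\rfloor)(V_{\lfloor tT\rfloor+1,n} - V_{\lfloor tT\rfloor,n})$. By (S8) and Theorem 3 in Zhou (2013), whenever $n(T)\to\infty$ and $T\to\infty$, $P_{\gamma(0)}^{\gamma(1)}U_n(t)$ converges to $U(t)$ under the uniform topology of $C([0, 1], (T_\mu\mathcal M, \|\cdot\|_\mu))$. By (S6) and the uniform convergence of $U_n(t)$, we conclude
\[
\max_{n\le k\le T-n+1}\Big\| V_{k,n} - \hat H_k\circ\hat H_T^{-1}\circ V_{T-n+1,n}\Big\|_{\hat\mu} \xrightarrow{D} \sup_{0\le t\le 1}\Big\| U(t) - H(t)\circ H^{-1}(1)\circ U(1)\Big\|_\mu. \tag{S9}
\]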
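The block sums $S_{j,n}$ and the multiplier path $V_{k,n}$ recalled in this proof can be sketched in a few lines. The example below is a simplified illustration, not Algorithm 1 itself: it works with Euclidean coordinates of the residuals $v_i$ (as if they were already expressed in a fixed basis of $T_{\hat\mu}\mathcal M$), computes $V_{k,n}$ for all $k = 1,\dots,T-n+1$, and omits the debiasing by $\hat H_k\circ\hat H_T^{-1}$. The function names `block_sums` and `bootstrap_path` are hypothetical.

```python
import numpy as np

def block_sums(v, n):
    """S_{j,n} = sum_{i=j}^{j+n-1} v_i for j = 1, ..., T-n+1 (rows of v are coordinates)."""
    T = v.shape[0]
    csum = np.vstack([np.zeros((1, v.shape[1])), np.cumsum(v, axis=0)])
    return csum[n:T + 1] - csum[0:T - n + 1]

def bootstrap_path(v, n, multipliers):
    """V_{k,n} = sum_{j<=k} {n (T-n+1)}^{-1/2} S_{j,n} R_j for k = 1, ..., T-n+1."""
    T = v.shape[0]
    S = block_sums(v, n)
    return np.cumsum(S * multipliers[:, None], axis=0) / np.sqrt(n * (T - n + 1))

rng = np.random.default_rng(2)
T, d, n = 300, 2, 10
v = rng.normal(size=(T, d))
v -= v.mean(axis=0)                 # mimics sum_i Log_{mu_hat} X_i = 0
R = rng.standard_normal(T - n + 1)  # i.i.d. standard normal multipliers R_j
V = bootstrap_path(v, n, R)         # one bootstrap replicate of the path
```

Repeating the last two lines with fresh multipliers gives further replicates $V^{(b)}_{k,n}$, whose debiased suprema would then serve as the bootstrap reference distribution in (S9).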
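S1.4 Theorem 3
Proof. For simplicity, we write $\tau(T)$ as $\tau$. Let $e_i$ be the coordinate representation of $\mathrm{Log}_{\mu_T(i/T)}X_i$ under the frame $E(\tau, i/T)$, i.e., $\mathrm{Log}_{\mu_T(i/T)}X_i = e_i^\top E(\tau, i/T)$; see also (6) in the main text. To begin with, for a fixed $p\in\mathcal M$, when condition (M1) or (M2) holds, there exists a constant $\lambda > 0$ such that
\[
\frac{1}{2T}\sum_{i=1}^{T}\big(d_{\mathcal M}^2(p, X_i) - d_{\mathcal M}^2(\mu, X_i)\big) \ge \Big\langle\frac{1}{T}\sum_{i=1}^{T}\mathrm{Log}_\mu X_i,\, \mathrm{Log}_\mu p\Big\rangle_\mu + \lambda\, d_{\mathcal M}^2(p, \mu).
\]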
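Taylor expansion of vector fields on the manifold yields
\[
\mathrm{Log}_\mu X_i = P_{\mu_T(i/T)}^{\mu}\big(e_i^\top E(\tau, \tfrac{i}{T})\big) - \tau H(\mu, X_i)\circ b(\tfrac{i}{T}) + O_p(\tau^2),
\]
and
\[
\begin{aligned}
\frac{1}{2T}\sum_{i=1}^{T}\big(d_{\mathcal M}^2(p, X_i) - d_{\mathcal M}^2(\mu, X_i)\big) &\ge \Big\langle\frac{1}{T}\sum_{i=1}^{T}\mathrm{Log}_\mu X_i,\, \mathrm{Log}_\mu p\Big\rangle_\mu + \lambda\, d_{\mathcal M}^2(p, \mu) \\
&= \frac{1}{2T}\sum_{i=1}^{T}\big\langle e_i^\top E(\tau, \tfrac{i}{T}),\, P_{\mu}^{\mu_T(i/T)}\mathrm{Log}_\mu p\big\rangle_\mu - \tau\Big\langle\frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\tfrac{i}{T}),\, \mathrm{Log}_\mu p\Big\rangle_\mu \\
&\quad + \lambda\, d_{\mathcal M}^2(p, \mu) + O_p(\tau^2).
\end{aligned}
\]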
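By the boundedness conditions on the curvature and the curve $b(\cdot)$, there exists a constant $C_b$ such that $\big\|\tau\frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\frac{i}{T})\big\|_\mu \le \tau C_b$. Let $\mu_p$ be the coordinate representation of $\mathrm{Log}_\mu p$ under the orthonormal basis $\{E_j, 1\le j\le d\}$ and $\{R_{\tau,T,i} : 1\le i\le T\}$ be a collection of $d\times d$ matrices such that
\[
(R_{\tau,T,i})_{j,k} = \big\langle E_j(\tau, \tfrac{i}{T}),\, P_{\mu}^{\mu_T(i/T)}E_k\big\rangle_{\mu_T(i/T)}.
\]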
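Then $\langle e_i^\top E(\tau, \frac{i}{T}),\, P_{\mu}^{\mu_T(i/T)}\mathrm{Log}_\mu p\rangle_{\mu_T(i/T)}$ can be rewritten as $e_i^\top R_{\tau,T,i}\mu_p$. Moreover, the local isometry property of parallel transport on smooth manifolds implies $R_{\tau,T,i} = I_d + O(\frac{\tau}{T})$, and hence
\[
\frac{1}{2T}\sum_{i=1}^{T}\big(d_{\mathcal M}^2(p, X_i) - d_{\mathcal M}^2(\mu, X_i)\big) \ge \frac{1}{2T}\sum_{i=1}^{T}e_i^\top\mu_p + \lambda\, d_{\mathcal M}^2(p, \mu) + O_p(\tau). \tag{S10}
\]
Since $\big|\frac{1}{2T}\sum_{i=1}^{T}e_i^\top\mu_p\big| \le \big\|\frac{1}{2T}\sum_{i=1}^{T}e_i\big\|\cdot d(\mu, p)$, by taking $p = \hat\mu$, we deduce from (S10) that
\[
d_{\mathcal M}(\hat\mu, \mu) = O_p\big(\max\{\tau, \sqrt{1/T}\}\big).
\]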
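Now we show that when $\lim_{T\to\infty}\frac{\tau}{T^{-1/2}} = \infty$, the CUSUM statistic $Q_T\to\infty$ almost surely. Let $\gamma : [0, 1]\to\mathcal M$ be the geodesic from $\hat\mu = \gamma(0)$ to $\mu = \gamma(1)$. Similarly to the proof of Theorem 1, we can show that
\[
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big) - \Big\{\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j - \frac{1}{\sqrt T}\sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\}\Big\|_\mu = O_p(\tau^2),
\]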
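and that
\[
\begin{aligned}
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big)\Big\|_\mu &= \sup_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j - \frac{1}{\sqrt T}\sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\|_\mu + O_p(\tau^2) \\
&\ge -\sup_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j\Big\|_\mu + \sqrt T\,\Big\|\Big(\frac{1}{T}\sum_{j=1}^{T}H(\mu, X_j)\Big)\circ\mathrm{Log}_\mu\hat\mu\Big\|_\mu + O_p(\tau^2).
\end{aligned}
\]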
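Note that $\sup_{1\le k\le T}\big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j\big\|_\mu = O_p(1)$ and that $\big\|\big(\frac{1}{T}\sum_{j=1}^{T}H(\mu, X_j)\big)\circ\mathrm{Log}_\mu\hat\mu\big\|_\mu \ge \lambda\, d_{\mathcal M}(\hat\mu, \mu)\asymp O_p(\tau)$ when condition (M1) or (M2) holds. Thus, we have
\[
Q_T = \sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big)\Big\|_\mu \ge -\sup_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j\Big\|_\mu + \lambda\sqrt T\, d_{\mathcal M}(\mu, \hat\mu) + O_p(\tau^2) \to +\infty, \quad\text{a.e.},
\]
whenever $\lim_{T\to\infty}\tau\sqrt T = \infty$.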
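We next show that, if $\tau(T) = T^{-1/2}$, then
\[
Q_T \xrightarrow{D} \sup_{0\le t\le 1}\Big\| U(t) - H(t)\circ H^{-1}(1)\circ U(1) + H(t)\circ H^{-1}(1)\circ\int_0^1\frac{\partial}{\partial\xi}H(\xi)\circ v(\xi)\,d\xi - \int_0^t\frac{\partial}{\partial\xi}H(\xi)\circ v(\xi)\,d\xi\Big\|_\mu, \tag{S11}
\]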
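where $H(t)$ is defined in Lemma 1. For convenience, let $\Pi(s, t)$ be the parallel transport map from $\gamma(s, t)$ to $\mu$ along the geodesic $s\mapsto\gamma(s, t)$. Since $\sum_i\mathrm{Log}_{\hat\mu}X_i = 0$ and $d_{\mathcal M}(\hat\mu, \mu) = O_p(T^{-1/2})$, we can apply Taylor expansion at $\hat\mu$ similarly to (S10) for $\tau = T^{-1/2}$ and obtain
\[
0 = \sum_{i=1}^{T}\Pi(\tau, \tfrac{i}{T})\big(e_i^\top E(\tau, \tfrac{i}{T})\big) - \frac{\tau}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\tfrac{i}{T}) - \frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\,\mathrm{Log}_\mu\hat\mu + O_p\big(\tfrac{1}{T}\big).
\]
The above identity implies that
\[
\mathrm{Log}_\mu\hat\mu = \Big(\frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\Big)^{-1}\circ\Big(\sum_{i=1}^{T}\Pi(\tau, \tfrac{i}{T})\big(e_i^\top E(\tau, \tfrac{i}{T})\big) - \frac{\tau}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\tfrac{i}{T})\Big) + O_p\big(\tfrac{1}{T}\big). \tag{S12}
\]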
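Substituting $\mathrm{Log}_\mu\hat\mu$ by (S12) in the equation
\[
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big) - \Big\{\frac{1}{\sqrt T}\sum_{j=1}^{k}\tilde e_j - \frac{1}{\sqrt T}\sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\}\Big\|_\mu = O_p\big(\tfrac{1}{\sqrt T}\big),
\]
we have
\[
\begin{aligned}
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big)\Big\|_\mu &= \frac{1}{\sqrt T}\sup_{1\le k\le T}\Big\|\sum_{j=1}^{k}\tilde e_j - \sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\|_\mu + O_p\big(\tfrac{1}{T}\big) \\
&= \frac{1}{\sqrt T}\sup_{1\le k\le T}\Big\|\sum_{j=1}^{k}\tilde e_j - \sum_{j=1}^{k}H(\mu, X_j)\circ\Big(\frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\Big)^{-1} \\
&\qquad\circ\Big(\sum_{i=1}^{T}\Pi(\tau, \tfrac{i}{T})\big(e_i^\top E(\tau, \tfrac{i}{T})\big) - \frac{\tau}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\tfrac{i}{T})\Big)\Big\|_\mu + O_p\big(\tfrac{1}{\sqrt T}\big).
\end{aligned}
\]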
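By Taylor expansion of $\tilde e_j$, i.e.,
\[
\tilde e_j = \Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) - \tau H(\mu, X_j)\circ b(\tfrac{j}{T}) + O_p\big(\tfrac{1}{\sqrt T}\big),
\]
the test statistic $Q_T = \sup_{1\le k\le T}\big\| P_{\gamma(0)}^{\gamma(1)}(\frac{1}{\sqrt T}S_k)\big\|_\mu$ can be further represented by
\[
\begin{aligned}
\sup_{1\le k\le T}\Big\| P_{\gamma(0)}^{\gamma(1)}\Big(\frac{1}{\sqrt T}S_k\Big)\Big\|_\mu &= \frac{1}{\sqrt T}\sup_{1\le k\le T}\Big\|\sum_{j=1}^{k}\tilde e_j - \sum_{j=1}^{k}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu\Big\|_\mu + O_p\big(\tfrac{1}{T}\big) \\
&= \frac{1}{\sqrt T}\sup_{1\le k\le T}\Big\|\sum_{j=1}^{k}\Big[\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) - \tau H(\mu, X_j)\circ b(\tfrac{j}{T})\Big] \\
&\qquad - \sum_{j=1}^{k}H(\mu, X_j)\circ\Big(\frac{1}{T}\sum_{i=1}^{T}H(\mu, X_i)\Big)^{-1} \\
&\qquad\circ\Big(\sum_{i=1}^{T}\Pi(\tau, \tfrac{i}{T})\big(e_i^\top E(\tau, \tfrac{i}{T})\big) - \frac{\tau}{T}\sum_{i=1}^{T}H(\mu, X_i)\circ b(\tfrac{i}{T})\Big)\Big\|_\mu + O_p\big(\tfrac{1}{\sqrt T}\big).
\end{aligned} \tag{S13}
\]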
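To show our desired asymptotic results, it suffices to prove the following claims:
(C1) $\frac{1}{\sqrt T}\sum_{j=1}^{k}\Pi(\tau, \frac{j}{T})\big(e_j^\top E(\tau, \frac{j}{T})\big) \xrightarrow{D} U(\frac{k}{T})$ as $T\to\infty$;
(C2) $\max_{1\le k\le T}\big\|\frac{1}{T}\sum_{i=1}^{k}H(\mu, X_i) - H(\frac{k}{T})\big\| = O_p(1/\sqrt T)$;
(C3) $\max_{1\le k\le T}\big\|\frac{1}{T}\sum_{i=1}^{k}H(\mu, X_i)\circ b(\frac{i}{T}) - \int_0^{k/T}\frac{\partial}{\partial\xi}H(\xi)\circ b(\xi)\,d\xi\big\| = O_p(1/\sqrt T)$.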
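Now we prove the claim (C1). Given $\tau = 1/\sqrt T$, Taylor expansion of parallel transport yields that
\[
\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) = \sum_{l=1}^{d}e_{j,l}\,\Pi(\tau, \tfrac{j}{T})\big(E_l(\tau, \tfrac{j}{T})\big) = e_j^\top E + \frac{1}{\sqrt T}\sum_{l=1}^{d}e_{j,l}\Big[\nabla_{b(j/T)}E_l(s, \tfrac{j}{T})\Big]\Big|_{s=0} + O_p\big(\tfrac{1}{T}\big),
\]
which implies that
\[
\max_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\Big[\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) - e_j^\top E - \frac{1}{\sqrt T}\sum_{l=1}^{d}e_{j,l}\Big[\nabla_{b(j/T)}E_l(s, \tfrac{j}{T})\Big]\Big|_{s=0}\Big]\Big\|_\mu = O_p\big(\tfrac{1}{\sqrt T}\big).
\]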
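For fixed $l$ and $T$, define
\[
\Xi_{k,l} = \frac{1}{T}\sum_{j=1}^{k}e_{j,l}\Big[\nabla_{b(j/T)}E_l(s, \tfrac{j}{T})\Big]\Big|_{s=0}, \quad 1\le k\le T.
\]
Then $\{\Xi_{k,l}\}_{k=1}^{T}$ is an $L^2$-martingale since $\nabla_{b(j/T)}E_l(s, \frac{j}{T})$ is uniformly bounded whenever $b(\cdot)$ is bounded and continuous. By Doob’s inequality, we have $\mathbb E\max_{1\le k\le T}\|\Xi_{k,l}\|_\mu^2 \le C\,\mathbb E\|\Xi_{T,l}\|_\mu^2$ for some constant $C$. Similarly to the proof of Lemma 1, $\mathbb E\|\Xi_{T,l}\|_\mu^2 = O(\frac{1}{T})$, which ensures that $\mathbb E\max_{1\le k\le T}\|\Xi_{k,l}\|_\mu = O(\frac{1}{\sqrt T})$ and that
\[
\max_{1\le k\le T}\Big\|\frac{1}{\sqrt T}\sum_{j=1}^{k}\Big[\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) - e_j^\top E\Big]\Big\|_\mu = O_p\big(\tfrac{1}{\sqrt T}\big).
\]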
Applying the weak convergence of $\frac{1}{\sqrt T}\sum_{j=1}^{k}e_j^\top E$ to $U(\cdot)$ provided by Lemma 1, we conclude that $\frac{1}{\sqrt T}\sum_{j=1}^{k}\Pi(\tau, \frac{j}{T})\big(e_j^\top E(\tau, \frac{j}{T})\big)$ converges to $U(\cdot)$, as claimed.
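To prove Claim (C2), it suffices to show that
\[
\Big\|\mathbb E H(\mu, X_i) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{i}{T}, \mathcal F_0)^\top E)\big)\Big\|_{HS} \le C/\sqrt T,
\]
for some constant $C$, since the physical dependence measure is the same as in Theorem 1. Let $\mathcal J(t, e_i)$ be the Jacobi field along the geodesic $t\mapsto\mathrm{Exp}_\mu(t\tau b(\frac{i}{T}))$ such that $\mathcal J(0, e_i) = \tau b(\frac{i}{T})$ and
\[
\frac{\partial}{\partial t}\mathcal J(t, e_i)\Big|_{t=0} = \tau\sum_{l=1}^{d}e_{i,l}\,\nabla_{b(i/T)}E_l(s, \tfrac{i}{T})\Big|_{s=0}.
\]
By the Lipschitz condition of the Hessian tensor and the sub-exponential condition of $G_E(\frac{i}{T}, \mathcal F_0)$, we have
\[
\begin{aligned}
\Big\|\mathbb E H(\mu, X_i) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{i}{T}, \mathcal F_0)^\top E)\big)\Big\|_{HS} &\le L_H\,\mathbb E\, d_{\mathcal M}\Big(\mathrm{Exp}_\mu\big\{G_E(\tfrac{i}{T}, \mathcal F_0)^\top E\big\},\, \mathrm{Exp}_{\mu_T(i/T)}\big\{G_E(\tfrac{i}{T}, \mathcal F_0)^\top E(\tau, \tfrac{i}{T})\big\}\Big) \\
&\le L_H\,\mathbb E\big\|\mathcal J(1, e_i)\big\|_{\mathrm{Exp}_\mu(\tau b(i/T))} \\
&\le \tau C L_H \quad\text{(by Gronwall’s inequality; Taylor 2010)}
\end{aligned} \tag{S14}
\]
for some constant $C < \infty$ and $\tau = 1/\sqrt T$.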
Claim (C3) holds by applying a similar argument to $H(\mu, X_i)\circ b(\frac{i}{T})$ instead of $H(\mu, X_i)$. By applying the results of Claims (C1)-(C3) to (S13), we obtain the desired asymptotic convergence in (S11).
S1.5 Theorem 4
Proof. To prove the consistency of the bootstrap procedure, it suffices to show that the estimated covariance function is consistent. The argument is similar to the proof of Theorem 2. Applying Taylor expansion twice, we have
\[
\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}S_{l,n} = \frac{1}{\sqrt n}\sum_{j=l+1}^{l+n}\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) + \frac{\tau}{\sqrt n}\sum_{j=l+1}^{l+n}H(\mu, X_j)\circ b(\tfrac{j}{T}) - \frac{1}{\sqrt n}\sum_{j=l+1}^{l+n}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu + o_p(\tau\sqrt n).
\]
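Moreover, Taylor expansion of parallel transport yields that
\[
\Pi(\tau, \tfrac{j}{T})\big(e_j^\top E(\tau, \tfrac{j}{T})\big) = \sum_{k=1}^{d}e_{j,k}\,\Pi(\tau, \tfrac{j}{T})\big(E_k(\tau, \tfrac{j}{T})\big) = e_j^\top E + \tau\sum_{k=1}^{d}e_{j,k}\Big[\nabla_{b(j/T)}E_k(s, \tfrac{j}{T})\Big]\Big|_{s=0} + O_p(\tau^2),
\]
and that
\[
\begin{aligned}
\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}S_{l,n} &= \frac{1}{\sqrt n}\sum_{j=l+1}^{l+n}\Big(e_j^\top E + \tau\sum_{k=1}^{d}e_{j,k}\Big[\nabla_{b(j/T)}E_k(s, \tfrac{j}{T})\Big]\Big|_{s=0}\Big) + \frac{\tau}{\sqrt n}\sum_{j=l+1}^{l+n}H(\mu, X_j)\circ b(\tfrac{j}{T}) \\
&\quad - \frac{1}{\sqrt n}\sum_{j=l+1}^{l+n}H(\mu, X_j)\circ\mathrm{Log}_\mu\hat\mu + o_p(\tau\sqrt n).
\end{aligned}
\]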
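If $\lim_{T\to\infty}\sqrt n\,\tau = 0$ and $n\to\infty$, we have
\[
\frac{1}{\sqrt n}P_{\gamma(0)}^{\gamma(1)}S_{l,n} = \frac{1}{\sqrt n}\sum_{j=l+1}^{l+n}\big(e_j^\top E\big) + O_p(\tau\sqrt n),
\]
which implies that
\[
\max_{n+1\le i\le j\le T-n+1}\Big| B_{i,j,a,b} - \int_{i/T}^{j/T}\{\Sigma_E(\xi)\}_{a,b}\,d\xi\Big| = o_p(1), \tag{S15}
\]
where $B_{i,j,a,b}$ is defined in (S7).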
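We then show that $\{\frac{1}{T}\sum_{i=1}^{k}H(\hat\mu, X_i)\}_{k=1}^{T}$ consistently estimates $H(\cdot)$. Since convergence of $\{\frac{1}{T}\sum_{i=1}^{k}H(\hat\mu, X_i)\}_{k=1}^{T}$ to $\{\frac{1}{T}\sum_{i=1}^{k}H(\mu, X_i)\}_{k=1}^{T}$ with rate $O_p(\max\{\tau, 1/\sqrt T\})$ is guaranteed by (S5), it suffices to show that $\{\frac{1}{T}\sum_{i=1}^{k}H(\mu, X_i)\}_{k=1}^{T}$ converges to $H(\cdot)$. The difference $H(\mu, X_i) - \mathbb E H(\mu, \mathrm{Exp}_\mu(G_E(\frac{i}{T}, \mathcal F_0)^\top E))$ can be rewritten as
\[
H(\mu, X_i) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{i}{T}, \mathcal F_0)^\top E)\big) = \big\{H(\mu, X_i) - \mathbb E H(\mu, X_i)\big\} + \big\{\mathbb E H(\mu, X_i) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{i}{T}, \mathcal F_0)^\top E)\big)\big\}.
\]
By (S14), the bias term $\mathbb E H(\mu, X_i) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\frac{i}{T}, \mathcal F_0)^\top E)\big)$ is bounded by $O(\tau)$. Similarly to the argument in the proof of Lemma 1, we have
\[
\max_{1\le k\le T}\Big\|\frac{1}{T}\sum_{i=1}^{k}\big\{H(\mu, X_i) - \mathbb E H(\mu, X_i)\big\}\Big\|_{HS}^2 = O_p\big(\tfrac{1}{T}\big).
\]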
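Hence, we can deduce that
\[
\sup_{1\le k\le T}\Big|\big\langle P_{\gamma(1)}^{\gamma(0)}(H(k/T)\circ E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_\mu - \big\langle\hat H_k\circ P_{\gamma(1)}^{\gamma(0)}(E_i),\, P_{\gamma(1)}^{\gamma(0)}(E_j)\big\rangle_{\hat\mu}\Big| = O_P\big(\max\{\tau, T^{-1/2}\}\big). \tag{S16}
\]
Let $U_n(t) : [\frac{n}{T}, 1 - \frac{n}{T}]\to T_{\hat\mu}\mathcal M$ be the piecewise linear interpolation of $V_{k,n}$, i.e., $U_n(t) = V_{\lfloor tT\rfloor,n} + (tT - \lfloor tT\rfloor)(V_{\lfloor tT\rfloor+1,n} - V_{\lfloor tT\rfloor,n})$. Combining the results given by (S15) and (S16), conditioned on $\{X_i\}_{i=1}^{T}$, $U_n(t)$ converges to $U(t)$ on $C([0, 1], (T_\mu\mathcal M, \|\cdot\|_\mu))$ with the uniform topology.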
S1.6 Theorem 5
To establish the asymptotic result, we follow the arguments in Aue & van Delft (2020) and van Delft et al. (2021). Unlike in the Hilbert space setting, on a general Riemannian manifold $\mathcal M$ we have to account for the curvature effect in $J_n$:
\[
\begin{aligned}
P_{\gamma(0)}^{\gamma(1)}J_n(\omega, t) &= P_{\gamma(0)}^{\gamma(1)}\frac{1}{\sqrt{2\pi n}}\sum_{s=0}^{n-1}\mathrm{Log}_{\hat\mu}X_{\lfloor tT\rfloor-n/2+1+s,T}\cdot e^{-is\omega} \\
&= \frac{1}{\sqrt{2\pi n}}\sum_{s=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+s,T}\cdot e^{-is\omega} - \frac{1}{\sqrt{2\pi n}}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega} + O_P\Big(\frac{\sqrt n}{T}\Big).
\end{aligned}
\]
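Define $\tilde P_{\gamma(0)}^{\gamma(1)}(U\otimes V) = P_{\gamma(0)}^{\gamma(1)}(U)\otimes P_{\gamma(0)}^{\gamma(1)}(V)$. Then we have
\[
\begin{aligned}
\tilde P_{\gamma(0)}^{\gamma(1)}I_n(\omega, t) &= \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+s,T}\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big) \\
&\quad + \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big) \\
&\quad + \Big(\frac{1}{2\pi n}\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big)\otimes\Big(\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big) \\
&\quad + \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+r,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-ir\omega}\Big) \\
&\quad + O_P\Big(\frac{\sqrt n}{T}\Big).
\end{aligned}
\]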
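Let
\[
\begin{aligned}
A_n(\omega, t) &= \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+s,T}\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big), \\
B_n(\omega, t) &= \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big), \\
C_n(\omega, t) &= \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+r,T})\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-ir\omega}\Big).
\end{aligned}
\]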
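The asymptotic behaviour of $A_n(\omega, t)$ is well studied in Aue & van Delft (2020) and van Delft et al. (2021). We then investigate the property of $B_n(\omega, t)$. Notice that $B_n(\omega, t)$ is multi-linear in $\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega}$, $\mathrm{Log}_\mu\hat\mu$ and $\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}$, where the asymptotic distributions of the latter two terms are known. Thus, we only need to identify the asymptotic behaviour of $\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega}$, which we rewrite as
\[
\begin{aligned}
\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega} &= \sum_{s=0}^{n-1}\big\{H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T}) - \mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\big\}\cdot e^{-is\omega} \\
&\quad + \sum_{s=0}^{n-1}\mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega}.
\end{aligned}
\]
We claim:
(C4) $\sum_{s=0}^{n-1}\mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega} = O(n^2/T)$, for $\omega = 2\pi k/n$, $k = 1,\dots,n$;
(C5) $\sum_{s=0}^{n-1}\big\{H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T}) - \mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\big\}\cdot e^{-is\omega} = O_p(\sqrt n)$.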
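To prove the claim (C4), without loss of generality, we consider $t = \frac{n}{2T}$, and rewrite $\sum_{s=0}^{n-1}\mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega}$ as follows:
\[
\begin{aligned}
\sum_{s=0}^{n-1}\mathbb E H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega} &= \sum_{s=0}^{n-1}\mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{s+1}{T}, \mathcal F_0)^\top E)\big)\cdot e^{-is\omega} \\
&= \sum_{s=0}^{n-1}\Big\{\mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{1+s}{T}, \mathcal F_0)^\top E)\big) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{n}{2T}, \mathcal F_0)^\top E)\big)\Big\}\cdot e^{-is\omega} \\
&\quad + \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{n}{2T}, \mathcal F_0)^\top E)\big)\cdot\sum_{s=0}^{n-1}e^{-is\omega}.
\end{aligned}
\]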
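Notice that when $\omega = 2\pi k/n$, $k = 1,\dots,n$, $\sum_{s=0}^{n-1}e^{-is\omega} = 0$ by the property of the discrete Fourier transform. By the Lipschitz continuity provided by Eq. (S2), we have
\[
\Big\|\sum_{s=0}^{n-1}\Big\{\mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{1+s}{T}, \mathcal F_0)^\top E)\big) - \mathbb E H\big(\mu, \mathrm{Exp}_\mu(G_E(\tfrac{n}{2T}, \mathcal F_0)^\top E)\big)\Big\}\cdot e^{-is\omega}\Big\|_{HS} \le \sum_{s=0}^{n-1}C L_H\Big|\frac{s+1-n/2}{T}\Big| = O\Big(\frac{n^2}{T}\Big).
\]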
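We then apply martingale techniques to prove Claim (C5). Let $\{H_{i,jk}\}_{1\le j\le k\le d}$ be the coordinate representation of $H(\mu, X)$ under the basis $\{E_1,\dots,E_d\}$, i.e. $H_{i,jk} = \langle H_iE_j, E_k\rangle_\mu$. It suffices to show that
\[
\sum_{s=0}^{n-1}(H_{s+1,jk} - \mathbb E H_{s+1,jk})\cdot e^{-is\omega} = O_p(\sqrt n).
\]
We consider the real part of $\sum_{s=0}^{n-1}(H_{s+1,jk} - \mathbb E H_{s+1,jk})\cdot e^{-is\omega}$. Let $\Xi_{l,jk} = \sum_{s=0}^{l-1}(H_{s+1,jk} - \mathbb E[H_{s+1,jk}\mid\mathcal F_s])\cos(s\omega)$ and $V_{l,jk} = \sum_{s=0}^{l-1}(\mathbb E[H_{s+1,jk}\mid\mathcal F_s] - \mathbb E[H_{s+1,jk}])\cos(s\omega)$. Then $\{\Xi_{l,jk}\}_{l=1}^{n}$ is a bounded martingale, and by elementary calculation we have $\mathbb E|\Xi_{n,jk}|^2 = O(n)$. Similarly to the argument in the proof of Lemma 1, we apply Theorem 1 in Wu (2005) and obtain $\mathbb E|V_{n,jk}|^2 = O(n)$. The second moment of the imaginary part is also bounded by $O(n)$ by a similar argument, and thus (C5) is proved.
Claims (C4) and (C5) indicate that $\frac{1}{n}\sum_{s=0}^{n-1}H(\mu, X_{\lfloor tT\rfloor-n/2+1+s,T})\cdot e^{-is\omega} = O_p(\frac{1}{\sqrt n} + \frac{n}{T})$, and that $B_n(\omega, t) = O_p(\frac{n^{3/2}}{T^{3/2}} + \frac{1}{T^{1/2}})$, which may not converge to 0 in the test statistic scaled by $\sqrt T$. Similarly, we have $C_n(\omega, t) = O_p(\frac{1}{T} + \frac{n^3}{T^3} + \frac{n^{3/2}}{T^2}) = o_p(\frac{1}{\sqrt T})$, which is negligible in the test statistic as $T\to\infty$. Denote by $\bar H_i = H_i - \mathbb E H_i$ the centered version of $H_i$. Then
\[
B_n(\omega, t) = O_p\Big(\frac{n^{3/2}}{T^{3/2}}\Big) + \Big(\frac{1}{2\pi n}\sum_{s=0}^{n-1}\bar H_{\lfloor tT\rfloor-n/2+1+s,T}\circ\mathrm{Log}_\mu\hat\mu\cdot e^{-is\omega}\Big)\otimes\Big(\sum_{r=0}^{n-1}\mathrm{Log}_{\mu}X_{\lfloor tT\rfloor-n/2+1+r,T}\cdot e^{-ir\omega}\Big).
\]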
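Define $F_{H,G}(\omega, t)$ as the cross-spectral density function of $G_E(t, \mathcal F_0)^\top E$ and $H(\mu, \mathrm{Exp}_\mu(G_E(t, \mathcal F_0)^\top E)) - \mathbb E H(\mu, \mathrm{Exp}_\mu(G_E(t, \mathcal F_0)^\top E))$, i.e., for all $v\in T_\mu\mathcal M$, we have $F_{H,G}(\omega, t)(v) = \frac{1}{2\pi}\sum_{h\in\mathbb Z}\mathbb E[\bar H(t, \mathcal F_h)\circ v\otimes\{G_E(t, \mathcal F_0)^\top E\}]e^{i\omega h}$. Then $B_n(\omega, t)$ intuitively serves as an estimate of $F_{H,G}(\omega, t)(\mathrm{Log}_\mu\hat\mu)$.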
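We observe that
\[
\begin{aligned}
&\frac{\sqrt T}{T}\sum_{j=1}^{m}\sum_{k=1}^{n/2}\big(\langle I_n(\omega_k, t_j), I_n(\omega_{k-1}, t_j)\rangle - \langle A_n(\omega_k, t_j), A_n(\omega_{k-1}, t_j)\rangle\big) \\
&\quad= \frac{\sqrt T}{T}\sum_{j=1}^{m}\sum_{k=1}^{n/2}\langle A_n(\omega_k, t_j), B_n(\omega_{k-1}, t_j)\rangle + \frac{\sqrt T}{T}\sum_{j=1}^{m}\sum_{k=1}^{n/2}\langle B_n(\omega_k, t_j), A_n(\omega_{k-1}, t_j)\rangle \\
&\qquad + \frac{\sqrt T}{T}\sum_{j=1}^{m}\sum_{k=1}^{n/2}\langle B_n(\omega_k, t_j), B_n(\omega_{k-1}, t_j)\rangle + o_p(1/\sqrt T).
\end{aligned}
\]
Since $B_n(\omega_k, t_j) = O_p(1/\sqrt T)$, we have $\frac{\sqrt T}{T}\sum_{j=1}^{m}\sum_{k=1}^{n/2}\langle B_n(\omega_k, t_j), B_n(\omega_{k-1}, t_j)\rangle = O_p(1/\sqrt T)$.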
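For the quantity $\sqrt T\langle A_n(\omega_k, t_j), B_n(\omega_{k-1}, t_j)\rangle$, Taylor expansion yields that
\[
\sqrt T\langle A_n(\omega_k, t_j), B_n(\omega_{k-1}, t_j)\rangle = \langle J_n(-\omega_k, t_j), J_n(\omega_{k-1}, t_j)\rangle_\mu\times\langle J_n(\omega_k, t_j), J_n^{\bar H}(-\omega_{k-1}, t_j)\circ U_T\rangle_\mu + O_p\big(\tfrac{1}{T}\big),
\]
where
\[
J_n^{\bar H}(-\omega, t_j) = \frac{1}{\sqrt{2\pi n}}\sum_{s=0}^{n-1}e^{-is\omega}\,\bar H_{\lfloor tT\rfloor-n/2+1+s,T}\circ H^{-1}(1)
\]
is the local DFT of the tensor $\bar H_s\circ H^{-1}(1)$, and
\[
U_T = \frac{1}{\sqrt T}\sum_{j'=1}^{T}e_{j'}^\top E.
\]
Under the orthonormal basis $E = \{E_1,\dots,E_d\}$ on $T_\mu\mathcal M$, we can rewrite $\langle J_n(-\omega_k, t_j), J_n(\omega_{k-1}, t_j)\rangle_\mu\times\langle J_n(\omega_k, t_j), J_n^{\bar H}(-\omega_{k-1}, t_j)\circ U_T\rangle_\mu$ as
\[
\begin{aligned}
&\langle J_n(-\omega_k, t_j), J_n(\omega_{k-1}, t_j)\rangle_\mu\times\langle J_n(\omega_k, t_j), J_n^{\bar H}(-\omega_{k-1}, t_j)\circ U_T\rangle_\mu \\
&\quad= \sum_{l_3=1}^{d}\langle U_T, E_{l_3}\rangle_\mu\sum_{l_1=1}^{d}\sum_{l_2=1}^{d}\langle J_n(-\omega_k, t_j), E_{l_1}\rangle_\mu\langle J_n(\omega_{k-1}, t_j), E_{l_1}\rangle_\mu\times\langle J_n(\omega_k, t_j), E_{l_2}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_{k-1}, t_j)\circ E_{l_3}\rangle_\mu.
\end{aligned}
\]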
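We apply a similar expansion to $\frac{\sqrt T}{n}\sum_{k=1}^{n/2}\langle\frac{1}{m}\sum_{j=1}^{m}I_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}I_n(\omega_k, t_j)\rangle$. Similarly, we have
\[
\begin{aligned}
&\frac{\sqrt T}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}I_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}I_n(\omega_k, t_j)\Big\rangle - \frac{\sqrt T}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j)\Big\rangle \\
&\quad= \frac{\sqrt T}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}B_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j)\Big\rangle + \frac{\sqrt T}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}B_n(\omega_k, t_j)\Big\rangle + O_p(1/\sqrt T).
\end{aligned} \tag{S17}
\]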
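Note that $\frac{\sqrt T}{n}\sum_{k=1}^{n/2}\langle\frac{1}{m}\sum_{j=1}^{m}B_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j)\rangle$ is asymptotically equivalent to
\[
\begin{aligned}
&\frac{\sqrt T}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}B_n(\omega_k, t_j)\Big\rangle \\
&\quad= \frac{1}{n}\sum_{k=1}^{n/2}\Big\langle\frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}J_n^{\bar H}(-\omega_k, t_j)\circ U_T\otimes J_n(\omega_k, t_j)\Big\rangle + O_p(1/\sqrt T).
\end{aligned}
\]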
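Extracting the common linear factor $U_T$ above, we have
\[
\begin{aligned}
&\Big\langle\frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k, t_j), \frac{1}{m}\sum_{j=1}^{m}J_n^{\bar H}(-\omega_k, t_j)\circ U_T\otimes J_n(-\omega_k, t_j)\Big\rangle \\
&\quad= \sum_{l_3=1}^{d}\langle U_T, E_{l_3}\rangle_\mu\frac{1}{m^2}\sum_{l_1=1}^{d}\sum_{l_2=1}^{d}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\langle J_n(-\omega_k, t_{j_1}), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_{j_2}), E_{l_1}\rangle_\mu\times\langle J_n(\omega_k, t_{j_1}), E_{l_2}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_k, t_{j_2})\circ E_{l_3}\rangle_\mu \\
&\quad= \sum_{l_3=1}^{d}\sum_{l_1=1}^{d}\sum_{l_2=1}^{d}\langle U_T, E_{l_3}\rangle_\mu\Big(\frac{1}{m}\sum_{j_1=1}^{m}\langle J_n(-\omega_k, t_{j_1}), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_{j_1}), E_{l_2}\rangle_\mu\Big)\times\Big(\frac{1}{m}\sum_{j_2=1}^{m}\langle J_n(\omega_k, t_{j_2}), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_k, t_{j_2})\circ E_{l_3}\rangle_\mu\Big).
\end{aligned}
\]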
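Hence, the additional term induced by the curvature of the manifold, compared to Euclidean space, is asymptotically equivalent to $\Delta + \bar\Delta$, where $\Delta$ is given by
\[
\begin{aligned}
\Delta = \sum_{l_3=1}^{d}\sum_{l_1=1}^{d}\sum_{l_2=1}^{d}\langle U_T, E_{l_3}\rangle_\mu\Big\{&\frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\langle J_n(-\omega_k, t_j), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_j), E_{l_2}\rangle_\mu\times\langle J_n(\omega_{k-1}, t_j), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_{k-1}, t_j)\circ E_{l_3}\rangle_\mu \\
&- \frac{1}{n}\sum_{k=1}^{n/2}\Big(\frac{1}{m}\sum_{j_1=1}^{m}\langle J_n(-\omega_k, t_{j_1}), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_{j_1}), E_{l_2}\rangle_\mu\Big)\times\Big(\frac{1}{m}\sum_{j_2=1}^{m}\langle J_n(\omega_k, t_{j_2}), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_k, t_{j_2})\circ E_{l_3}\rangle_\mu\Big)\Big\},
\end{aligned}
\]
and $\bar\Delta$ is the complex conjugate of $\Delta$. Denote by $F^{GG}_{l_1,l_2}(u, \omega)$ the cross-spectral density of $\langle G_E(u, \mathcal F), E_{l_1}\rangle_\mu$ and $\langle G_E(u, \mathcal F), E_{l_2}\rangle_\mu$, and by $F^{GH}_{l_1,l_2,l_3}(u, \omega)$ the cross-spectral density of $\langle G_E(u, \mathcal F), E_{l_1}\rangle_\mu$ and $\langle E_{l_2}, H(\mu, \mathrm{Exp}_\mu(G_E(u, \mathcal F)))\circ E_{l_3}\rangle_\mu$. We claim that: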
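\[
\begin{aligned}
\Delta_{l_1,l_2,l_3} &= \Big\{\frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\langle J_n(-\omega_k, t_j), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_j), E_{l_2}\rangle_\mu\times\langle J_n(\omega_{k-1}, t_j), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_{k-1}, t_j)\circ E_{l_3}\rangle_\mu \\
&\qquad - \frac{1}{n}\sum_{k=1}^{n/2}\Big(\frac{1}{m}\sum_{j_1=1}^{m}\langle J_n(-\omega_k, t_{j_1}), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_{j_1}), E_{l_2}\rangle_\mu\Big)\times\Big(\frac{1}{m}\sum_{j_2=1}^{m}\langle J_n(\omega_k, t_{j_2}), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_k, t_{j_2})\circ E_{l_3}\rangle_\mu\Big)\Big\} \\
&= \frac{1}{4\pi}\int_{-\pi}^{\pi}\int_0^1 F^{GG}_{l_1,l_2}(u, \omega)F^{GH}_{l_1,l_2,l_3}(u, \omega)\,du\,d\omega - \frac{1}{4\pi}\int_{-\pi}^{\pi}\Big(\int_0^1 F^{GG}_{l_1,l_2}(u, \omega)\,du\Big)\Big(\int_0^1 F^{GH}_{l_1,l_2,l_3}(u, \omega)\,du\Big)d\omega + O_p\big(\tfrac{1}{m}\big),
\end{aligned} \tag{S18}
\]
which indicates that whenever $G_E(u, \mathcal F)$ is independent of $u$, $\Delta + \bar\Delta = O_p(\frac{1}{m})$, and the asymptotic distribution of the test statistic $V^2_{\hat F}$ only depends on $A_n(u, \omega)$, as in Euclidean space or Hilbert space; under $H_1$, the asymptotic distribution differs from its counterpart in Euclidean or Hilbert space in the asymptotic variance. In the remainder of this section we apply cumulant techniques to show that the claim given by (S18) holds.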
For a complex random variable $X$, we define $\bar X$ as the complex conjugate of $X$. Denote by $\mathrm{cum}_k(\cdot)$ the $k$th-order joint cumulant of random variables $X_1,\dots,X_k$, which is given by the coefficient of $i^k(t_1\cdots t_k)$ in the complex Taylor expansion of $\log\mathbb E[e^{i\sum_{j=1}^{k}t_jX_j}]$ at the origin (Brillinger 2001). In particular, $\mathrm{cum}_1(X) = \mathbb E X$ and $\mathrm{cum}_2(X, \bar Y) = \mathbb E(X\bar Y) - \mathbb E X\,\mathbb E\bar Y$. For comprehensive details on cumulants, we refer to the book by Brillinger (2001).
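For readers less familiar with joint cumulants, the following sketch computes them from product moments via the standard partition formula $\mathrm{cum}(X_1,\dots,X_k) = \sum_{\pi}(-1)^{|\pi|-1}(|\pi|-1)!\prod_{B\in\pi}\mathbb E\prod_{i\in B}X_i$, where the sum runs over all partitions $\pi$ of $\{1,\dots,k\}$. It is restricted to real-valued variables and uses the empirical distribution of a simulated sample; the helper names `set_partitions` and `joint_cumulant` are hypothetical, and the enumeration is only practical for small $k$.

```python
import math
import numpy as np

def set_partitions(items):
    """Yield all partitions of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part                      # `first` forms its own block
        for i in range(len(part)):                  # or joins an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def joint_cumulant(samples):
    """Joint cumulant of the columns of `samples` under the empirical distribution."""
    k = samples.shape[1]
    total = 0.0
    for part in set_partitions(list(range(k))):
        b = len(part)
        coef = (-1) ** (b - 1) * math.factorial(b - 1)
        prod = 1.0
        for block in part:
            prod *= np.mean(np.prod(samples[:, block], axis=1))
        total += coef * prod
    return total

rng = np.random.default_rng(3)
Z = rng.normal(size=(1000, 3))
X = Z @ np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]]) + 0.7

cum1 = joint_cumulant(X[:, [0]])          # equals the sample mean of column 0
cum2 = joint_cumulant(X[:, [0, 1]])       # equals the (biased) sample covariance
cum3 = joint_cumulant(X[:, [0, 1, 2]])    # equals the third central mixed moment
```

The first two values reproduce the identities $\mathrm{cum}_1(X) = \mathbb E X$ and $\mathrm{cum}_2(X, Y) = \mathbb E(XY) - \mathbb E X\,\mathbb E Y$ stated above for the real case.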
We mainly follow the discussions by van Delft et al. (2021). For simplicity, we use $D_{n,l}(\omega_k, t_j)$ to represent $\langle J_n(\omega_k, t_j), E_l\rangle_\mu$ and $D^H_{n,l,l'}(\omega_k, t_j)$ to represent $\langle E_l, J_n^{\bar H}(\omega_k, t_j)\circ E_{l'}\rangle_\mu$. Then by the expansion of cumulants (Brillinger 2001), Lemma S4.1 and Corollary S4.1 in van Delft et al. (2021), we have
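\[
\begin{aligned}
&\mathbb E\frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\langle J_n(-\omega_k, t_j), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_j), E_{l_2}\rangle_\mu\times\langle J_n(\omega_{k-1}, t_j), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_{k-1}, t_j)\circ E_{l_3}\rangle_\mu \\
&= \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_1\big(D_{n,l_1}(-\omega_k, t_j)D_{n,l_2}(\omega_k, t_j)D_{n,l_1}(\omega_{k-1}, t_j)D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big) \\
&= \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_4\big(D_{n,l_1}(-\omega_k, t_j), D_{n,l_2}(\omega_k, t_j), D_{n,l_1}(\omega_{k-1}, t_j), D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big) \\
&\quad + \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_j), D_{n,l_2}(\omega_k, t_j)\big)\,\mathrm{cum}_2\big(D_{n,l_1}(\omega_{k-1}, t_j), D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big) \\
&\quad + \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_j), D_{n,l_1}(\omega_{k-1}, t_j)\big)\,\mathrm{cum}_2\big(D_{n,l_2}(\omega_k, t_j), D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big) \\
&\quad + \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_j), D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big)\,\mathrm{cum}_2\big(D_{n,l_2}(\omega_k, t_j), D_{n,l_1}(\omega_{k-1}, t_j)\big) \\
&= \frac{1}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_j), D_{n,l_2}(\omega_k, t_j)\big)\,\mathrm{cum}_2\big(D_{n,l_1}(\omega_{k-1}, t_j), D^H_{n,l_2,l_3}(-\omega_{k-1}, t_j)\big) + O\big(\tfrac{1}{T}\big) + O\big(\tfrac{1}{m^2}\big) + O\big(\tfrac{1}{n}\big) \\
&= \frac{1}{4\pi}\int_{-\pi}^{\pi}\int_0^1 F^{GG}_{l_1,l_2}(u, \omega)F^{GH}_{l_1,l_2,l_3}(u, \omega)\,du\,d\omega + O\big(\tfrac{1}{m^2}\big) + O\big(\tfrac{1}{n}\big) + O\big(\tfrac{1}{T}\big).
\end{aligned}
\]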
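Similarly, we have
\[
\begin{aligned}
&\mathbb E\frac{1}{n}\sum_{k=1}^{n/2}\Big(\frac{1}{m}\sum_{j_1=1}^{m}\langle J_n(-\omega_k, t_{j_1}), E_{l_1}\rangle_\mu\langle J_n(\omega_k, t_{j_1}), E_{l_2}\rangle_\mu\Big)\times\Big(\frac{1}{m}\sum_{j_2=1}^{m}\langle J_n(\omega_k, t_{j_2}), E_{l_1}\rangle_\mu\langle E_{l_2}, J_n^{\bar H}(-\omega_k, t_{j_2})\circ E_{l_3}\rangle_\mu\Big) \\
&= \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_1\big(D_{n,l_1}(-\omega_k, t_{j_1})D_{n,l_2}(\omega_k, t_{j_1})D_{n,l_1}(\omega_k, t_{j_2})D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) \\
&= \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_4\big(D_{n,l_1}(-\omega_k, t_{j_1}), D_{n,l_2}(\omega_k, t_{j_1}), D_{n,l_1}(\omega_k, t_{j_2}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) \\
&\quad + \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_{j_1}), D_{n,l_2}(\omega_k, t_{j_1})\big)\,\mathrm{cum}_2\big(D_{n,l_1}(\omega_k, t_{j_2}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) \\
&\quad + \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_{j_1}), D_{n,l_1}(\omega_k, t_{j_2})\big)\,\mathrm{cum}_2\big(D_{n,l_2}(\omega_k, t_{j_1}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) \\
&\quad + \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_{j_1}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big)\,\mathrm{cum}_2\big(D_{n,l_1}(\omega_k, t_{j_2}), D_{n,l_2}(\omega_k, t_{j_1})\big) \\
&= \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_{j_1}), D_{n,l_2}(\omega_k, t_{j_1})\big)\,\mathrm{cum}_2\big(D_{n,l_1}(\omega_k, t_{j_2}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) \\
&\quad + \frac{1}{mT}\sum_{k=1}^{n/2}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\mathrm{cum}_2\big(D_{n,l_1}(-\omega_k, t_{j_1}), D_{n,l_1}(\omega_k, t_{j_2})\big)\,\mathrm{cum}_2\big(D_{n,l_2}(\omega_k, t_{j_1}), D^H_{n,l_2,l_3}(-\omega_k, t_{j_2})\big) + O\big(\tfrac{1}{T}\big) + O\big(\tfrac{1}{m^2}\big) \\
&= \frac{1}{4\pi}\int_{-\pi}^{\pi}\Big(\int_0^1 F^{GG}_{l_1,l_2}(u, \omega)\,du\Big)\Big(\int_0^1 F^{GH}_{l_1,l_2,l_3}(u, \omega)\,du\Big)d\omega + \frac{1}{4\pi m}\int_{-\pi}^{\pi}\int_0^1 F^{GG}_{l_1,l_1}(u, \omega)F^{GH}_{l_2,l_2,l_3}(u, \omega)\,du\,d\omega + O\big(\tfrac{1}{T}\big) + O\big(\tfrac{1}{m^2}\big).
\end{aligned}
\]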
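We then compute $\mathrm{cum}_2(\Delta_{l_1,l_2,l_3}, \Delta_{l_1,l_2,l_3})$, the second-order cumulant of $\Delta_{l_1,l_2,l_3}$. We begin by considering the quantity
\[
\begin{aligned}
V_1(\Delta_{l_1,l_2,l_3}) = \frac{1}{T^2}\sum_{k_1,k_2}\sum_{j_1,j_2}\mathrm{cum}_2\big(&D_{n,l_1}(-\omega_{k_1}, t_{j_1})D_{n,l_2}(\omega_{k_1}, t_{j_1})D_{n,l_1}(\omega_{k_1-1}, t_{j_1})D^H_{n,l_2,l_3}(-\omega_{k_1-1}, t_{j_1}), \\
&D_{n,l_1}(-\omega_{k_2}, t_{j_2})D_{n,l_2}(\omega_{k_2}, t_{j_2})D_{n,l_1}(\omega_{k_2-1}, t_{j_2})D^H_{n,l_2,l_3}(-\omega_{k_2-1}, t_{j_2})\big).
\end{aligned}
\]
It suffices to compute the products of cumulants induced by all indecomposable partitions with size larger than 3 of the following array:
\[
\begin{array}{cccc}
\overbrace{D_{n,l_1}(-\omega_{k_1}, t_{j_1})}^{1}, & \overbrace{D_{n,l_2}(\omega_{k_1}, t_{j_1})}^{2}, & \overbrace{D^H_{n,l_2,l_3}(-\omega_{k_1-1}, t_{j_1})}^{3}, & \overbrace{D_{n,l_1}(\omega_{k_1-1}, t_{j_1})}^{4}, \\
\overbrace{D_{n,l_1}(-\omega_{k_2}, t_{j_2})}^{5}, & \overbrace{D_{n,l_2}(\omega_{k_2}, t_{j_2})}^{6}, & \overbrace{D^H_{n,l_2,l_3}(-\omega_{k_2-1}, t_{j_2})}^{7}, & \overbrace{D_{n,l_1}(\omega_{k_2-1}, t_{j_2})}^{8}.
\end{array}
\]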
In particular, the significant terms of the cumulant have structures $\mathrm{cum}_4(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$ and $\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$. By Lemma S4.1 and Lemma S4.2 in van Delft et al. (2021), the significant terms with the structure $\mathrm{cum}_4(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$ can only be included in arrays with $j_1 = j_2$ and induced by one of the following partitions:
{(1256)(34)(78)}, {(1278)(34)(56)}, {(3456)(12)(78)}, {(3478)(12)(56)}.
There are $n^2m$ terms satisfying these two conditions, and each of them is of order $\frac{1}{n}$, which means that their contribution to the quantity $V_1(\Delta_{l_1,l_2,l_3})$ is of order $O(\frac{1}{T^2}\times n^2m\times\frac{1}{n}) = O(\frac{1}{T})$. The terms with structure $\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$ that are asymptotically significant can only be included in arrays with $j_1 = j_2$ and $|k_1 - k_2|\le 1$ and induced by one of the following partitions:
{(12)(37)(56)(48)}, {(12)(36)(78)(45)}, {(15)(26)(34)(78)}, {(18)(27)(34)(56)}, {(15)(26)(37)(48)}.
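The combinatorics of pair partitions of the $2\times 4$ array can be enumerated directly. The sketch below lists all partitions of the labels 1 to 8 into pairs and keeps the indecomposable ones, using the fact that for a two-row array a partition is indecomposable exactly when at least one block contains labels from both rows. The helper names are hypothetical, and the enumeration only covers the all-pairs structure $\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$; it does not evaluate the order of each term.

```python
def pair_partitions(items):
    """Yield all partitions of `items` (even length) into unordered pairs."""
    if not items:
        yield []
        return
    first = items[0]
    for i in range(1, len(items)):
        rest = items[1:i] + items[i + 1:]
        for tail in pair_partitions(rest):
            yield [(first, items[i])] + tail

ROW1, ROW2 = {1, 2, 3, 4}, {5, 6, 7, 8}

def is_indecomposable(partition):
    """For a two-row array: some block must contain labels from both rows."""
    return any((set(block) & ROW1) and (set(block) & ROW2) for block in partition)

all_pairings = list(pair_partitions(list(range(1, 9))))
indecomposable = [p for p in all_pairings if is_indecomposable(p)]

# The five pair partitions singled out in the text.
listed = [
    [(1, 2), (3, 7), (5, 6), (4, 8)],
    [(1, 2), (3, 6), (7, 8), (4, 5)],
    [(1, 5), (2, 6), (3, 4), (7, 8)],
    [(1, 8), (2, 7), (3, 4), (5, 6)],
    [(1, 5), (2, 6), (3, 7), (4, 8)],
]
```

Of the 105 pair partitions of eight labels, 9 pair labels only within rows, leaving 96 indecomposable ones; the five partitions listed in the text are among them.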
27

There are at most 3
2(mn) terms satisfying these two restrictions, with each is of order O(1)
and uniformly bounded, hence will contribute to the quantity V1(∆l1,l2,l3) with asymptotic
order O(mn ×
1
T 2) = O( 1
T ). It remains to show that V2(δl1,l2,l3) also vanishes as T →∞,
where
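$$V_2(\Delta_{l_1,l_2,l_3}) = \frac{1}{T^2}\sum_{k_1,k_2}\sum_{j_1,j_2,j_3,j_4}\mathrm{cum}_2\Big(D_{n,l_1}(-\omega_{k_1},t_{j_1})\,D_{n,l_2}(\omega_{k_1},t_{j_1})\,D_{n,l_1}(\omega_{k_1},t_{j_2})\,D^{H}_{n,l_2,l_3}(-\omega_{k_1},t_{j_2}),\; D_{n,l_1}(-\omega_{k_2},t_{j_3})\,D_{n,l_2}(\omega_{k_2},t_{j_3})\,D_{n,l_1}(\omega_{k_2},t_{j_4})\,D^{H}_{n,l_2,l_3}(-\omega_{k_2},t_{j_4})\Big).$$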
As in the discussion of $V_1(\Delta_{l_1,l_2,l_3})$, we calculate the order of the cumulants induced by indecomposable partitions of the array:
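$$\overbrace{D_{n,l_1}(-\omega_{k_1},t_{j_1})}^{1},\quad \overbrace{D_{n,l_2}(\omega_{k_1},t_{j_1})}^{2},\quad \overbrace{D^{H}_{n,l_2,l_3}(-\omega_{k_1},t_{j_2})}^{3},\quad \overbrace{D_{n,l_1}(\omega_{k_1},t_{j_2})}^{4},$$
$$\overbrace{D_{n,l_1}(-\omega_{k_2},t_{j_3})}^{5},\quad \overbrace{D_{n,l_2}(\omega_{k_2},t_{j_3})}^{6},\quad \overbrace{D^{H}_{n,l_2,l_3}(-\omega_{k_2},t_{j_4})}^{7},\quad \overbrace{D_{n,l_1}(\omega_{k_2},t_{j_4})}^{8}.$$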
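As for $V_1(\Delta_{l_1,l_2,l_3})$, we only need to consider cumulants with the structures $\mathrm{cum}_4(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$ and $\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$. The only significant terms with the structure $\mathrm{cum}_4(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$ are induced by the same partitions as for $V_1(\Delta_{l_1,l_2,l_3})$:
{(1256)(34)(78)}, {(1278)(34)(56)}, {(3456)(12)(78)}, {(3478)(12)(56)},
and these are of order $O(1/T)$. Similarly, there are $O(nm^3)$ significant terms with the structure $\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)\mathrm{cum}_2(\cdot)$, each of order $O(1)$, which contribute to $V_2(\Delta_{l_1,l_2,l_3})$ with asymptotic order $1/T$. Consequently, we have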
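$$\Delta_{l_1,l_2,l_3} = \frac{1}{4\pi}\int_{-\pi}^{\pi}\int_0^1 F^{GG}_{l_1,l_2}(u,\omega)\,F^{GH}_{l_1,l_2,l_3}(u,\omega)\,du\,d\omega - \frac{1}{4\pi}\int_{-\pi}^{\pi}\int_0^1 F^{GG}_{l_1,l_2}(u,\omega)\,du\left(\int_0^1 F^{GH}_{l_1,l_2,l_3}(u,\omega)\,du\right)d\omega + O_p\!\left(\frac{1}{m}\right) + O_p\!\left(\frac{1}{T}\right), \tag{S19}$$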
as claimed.
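The combinatorial part of the argument above can be checked mechanically. The following is a minimal sketch, not part of the original proof: it labels the entries of the 2 x 4 array by 1 to 8, with rows {1, 2, 3, 4} and {5, 6, 7, 8}, and uses the standard criterion that, for a two-row array, a partition is indecomposable exactly when at least one block contains labels from both rows. The helper names (`parse`, `is_partition`, `is_indecomposable`, `pairings`) are hypothetical and introduced only for illustration.

```python
# Rows of the 2 x 4 array: labels 1-4 belong to (k1, j1), labels 5-8 to (k2, j2).
ROWS = [{1, 2, 3, 4}, {5, 6, 7, 8}]

# Partitions listed in the text for the two significant cumulant structures.
CUM4 = ["(1256)(34)(78)", "(1278)(34)(56)", "(3456)(12)(78)", "(3478)(12)(56)"]
CUM2 = ["(12)(37)(56)(48)", "(12)(36)(78)(45)", "(15)(26)(34)(78)",
        "(18)(27)(34)(56)", "(15)(26)(37)(48)"]


def parse(s):
    """Turn '(1256)(34)(78)' into [{1, 2, 5, 6}, {3, 4}, {7, 8}]."""
    return [set(int(c) for c in blk) for blk in s.strip("()").split(")(")]


def is_partition(blocks):
    """True if the blocks use each label 1..8 exactly once."""
    return sorted(x for b in blocks for x in b) == list(range(1, 9))


def is_indecomposable(blocks):
    """Two-row criterion: some block must meet both rows."""
    return any(bool(b & ROWS[0]) and bool(b & ROWS[1]) for b in blocks)


def pairings(items):
    """Yield every partition of `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [{first, partner}] + tail


if __name__ == "__main__":
    for s in CUM4 + CUM2:
        blocks = parse(s)
        print(s, is_partition(blocks), is_indecomposable(blocks))
    all_pairs = list(pairings(list(range(1, 9))))
    print(len(all_pairs), sum(is_indecomposable(p) for p in all_pairs))
```

Of the 105 pairings of eight labels, 9 pair only within rows and are therefore decomposable, leaving 96 indecomposable ones. The five pairings listed in the text are the subset that remains significant after the restrictions on $j_1, j_2$ and $k_1, k_2$ are imposed; the sketch verifies only that they are valid indecomposable partitions, not their asymptotic order.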
When $H_0$ holds, (S19) implies that $\Delta + \bar{\Delta} = O_p(1/m)$, and the null distribution is the same as the null distribution given by Theorem 1 in van Delft et al. (2021). When $H_1$ holds, $\Delta + \bar{\Delta}$ converges weakly to a normal distribution, since $U_T$ is asymptotically normally distributed, as shown in the proof of Lemma 1, and $\Delta + \bar{\Delta}$ is asymptotically equivalent to a linear transform of $U_T$. By the asymptotic normality of
$$\frac{4\pi}{T}\sum_{k=1}^{n/2}\sum_{j=1}^{m}\langle A_n(\omega_k,t_j), A_n(\omega_{k-1},t_j)\rangle + \hat{W} - \frac{4\pi}{n}\sum_{k=1}^{n/2}\Big\langle \frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k,t_j),\ \frac{1}{m}\sum_{j=1}^{m}A_n(\omega_k,t_j)\Big\rangle$$
provided by Theorem 1 in van Delft et al. (2021), and the asymptotic normality of $\Delta + \bar{\Delta}$, we deduce that $\sqrt{T}\,(V^2_{\hat{F}} - V^2_{F})$ also converges to a normal distribution under $H_1$.
S1.7
Theorem 6
Here we show that the estimate given by
$$\hat{\sigma}^2_V = 16\pi^2 n^{-1}\sum_{k=1}^{n/2}\left(m^{-1}\sum_{j=1}^{m}\langle I_n(\omega_{k-1},t_j), I_n(\omega_k,t_j)\rangle_{HS}\right)^2 \tag{S20}$$
converges to the asymptotic variance under H0.
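To make the estimator (S20) concrete, here is a minimal NumPy sketch under simplifying assumptions that are not taken from the paper: each periodogram operator $I_n(\omega_k, t_j)$ is represented by a finite $d \times d$ complex matrix, and the operators are stored in an array of shape `(n//2 + 1, m, d, d)` whose first axis runs over $k = 0, 1, \dots, n/2$. The function name `sigma2_V_hat` and this storage layout are hypothetical.

```python
import numpy as np


def sigma2_V_hat(I, n):
    """Sketch of (S20) for matrix-valued periodograms.

    I : array of shape (n//2 + 1, m, d, d); I[k, j] stands in for
        I_n(omega_k, t_j) (assumed layout, not from the paper).
    n : block length appearing in the prefactor 16 * pi^2 / n.
    """
    I = np.asarray(I)
    # Hilbert-Schmidt inner product <A, B>_HS = sum_{a,b} A_ab * conj(B_ab),
    # evaluated for the pairs (I[k-1, j], I[k, j]) with k = 1, ..., n/2.
    inner = np.einsum("kjab,kjab->kj", I[:-1], np.conj(I[1:]))
    # Average over the m blocks; the value is real when the matrices are
    # Hermitian, so only the real part is kept (an assumption of this sketch).
    avg = inner.mean(axis=1).real
    return 16 * np.pi**2 / n * np.sum(avg**2)


if __name__ == "__main__":
    n, m, d = 8, 3, 2
    # Toy input: every periodogram equals the identity matrix.
    I = np.broadcast_to(np.eye(d, dtype=complex), (n // 2 + 1, m, d, d))
    print(sigma2_V_hat(I, n))
```

In the toy input every inner product equals $d$, so the estimate reduces to $16\pi^2 n^{-1}\cdot(n/2)\cdot d^2 = 8\pi^2 d^2$, which gives a quick sanity check on the indexing.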
Applying a technique similar to the one used to establish (S17), we have
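$$\begin{aligned}
\left(m^{-1}\sum_{j=1}^{m}\langle I_n(\omega_{k-1},t_j), I_n(\omega_k,t_j)\rangle_{HS}\right)^2
= \Bigg(& m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS} \\
&+ m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), B_n(\omega_k,t_j)\rangle_{HS} + m^{-1}\sum_{j=1}^{m}\langle B_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS} \\
&+ m^{-1}\sum_{j=1}^{m}\langle B_n(\omega_{k-1},t_j), B_n(\omega_k,t_j)\rangle_{HS} + O_p(T^{-1})\Bigg)^2.
\end{aligned}$$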
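Noting that $B_n(\omega_k, t_j) = O_p(T^{-1/2})$, we further have
$$\begin{aligned}
\left(m^{-1}\sum_{j=1}^{m}\langle I_n(\omega_{k-1},t_j), I_n(\omega_k,t_j)\rangle_{HS}\right)^2
&= \left(m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS} + O_p(T^{-1/2})\right)^2 \\
&= \left(m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS}\right)^2 + O_p(T^{-1/2})
\end{aligned}$$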
and
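$$\begin{aligned}
\hat{\sigma}^2_V &= 16\pi^2 n^{-1}\sum_{k=1}^{n/2}\left(m^{-1}\sum_{j=1}^{m}\langle I_n(\omega_{k-1},t_j), I_n(\omega_k,t_j)\rangle_{HS}\right)^2 \\
&= 16\pi^2 n^{-1}\sum_{k=1}^{n/2}\left(m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS}\right)^2 + O_p(T^{-1/2}).
\end{aligned}$$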
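The passage from the squared sum with an $O_p(T^{-1/2})$ remainder to the squared leading term can be spelled out as follows. This is a sketch under two assumptions not stated explicitly at this point of the text: the leading term $a_k$ is $O_p(1)$, and the remainder $r_k$ is $O_p(T^{-1/2})$ uniformly in $k$. The symbols $a_k$ and $r_k$ are shorthand introduced here only for illustration.

```latex
% Shorthand (hypothetical):
%   a_k = m^{-1} \sum_{j=1}^{m} \langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j) \rangle_{HS},
%   r_k = O_p(T^{-1/2}).
\begin{aligned}
(a_k + r_k)^2 &= a_k^2 + 2 a_k r_k + r_k^2 \\
              &= a_k^2 + O_p(1)\,O_p(T^{-1/2}) + O_p(T^{-1}) \\
              &= a_k^2 + O_p(T^{-1/2}), \\
16\pi^2 n^{-1} \sum_{k=1}^{n/2} (a_k + r_k)^2
              &= 16\pi^2 n^{-1} \sum_{k=1}^{n/2} a_k^2 + O_p(T^{-1/2}).
\end{aligned}
```

The last line uses the fact that the prefactor $n^{-1}$ offsets the $n/2$ summands, so averaging over frequencies does not inflate the order of the remainder.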
By Lemma 3.1 in van Delft et al. (2021), the quantity
$$16\pi^2 n^{-1}\sum_{k=1}^{n/2}\left(m^{-1}\sum_{j=1}^{m}\langle A_n(\omega_{k-1},t_j), A_n(\omega_k,t_j)\rangle_{HS}\right)^2$$
is a consistent estimate of $4\pi\int_{-\pi}^{\pi}\|\bar{F}_E(\omega)\|^4_{HS}\,d\omega$. Thus $\hat{\sigma}^2_V$ also converges to $4\pi\int_{-\pi}^{\pi}\|\bar{F}_E(\omega)\|^4_{HS}\,d\omega$, as desired.
References
Aue, A. & van Delft, A. (2020), ‘Testing for stationarity of functional time series in the frequency domain’, The Annals of Statistics 48(5), 2505–2547.
Brillinger, D. R. (2001), Time Series: Data Analysis and Theory, SIAM.
Durrett, R. (2019), Probability: Theory and Examples, Cambridge University Press.
Kendall, W. S. & Le, H. (2011), ‘Limit theorems for empirical Fréchet means of independent and non-identically distributed manifold-valued random variables’, Brazilian Journal of Probability and Statistics 25(3), 323–352.
Moakher, M. (2005), ‘A differential geometric approach to the geometric mean of symmetric positive-definite matrices’, SIAM Journal on Matrix Analysis and Applications 26(3), 735–747.
Taylor, M. (2010), Partial Differential Equations I: Basic Theory, Applied Mathematical Sciences, Springer New York.
van Delft, A., Characiejus, V. & Dette, H. (2021), ‘A nonparametric test for stationarity in functional time series’, Statistica Sinica 31(3), 1375–1395.
Wu, W. B. (2005), ‘Nonlinear system theory: Another look at dependence’, Proceedings of the National Academy of Sciences 102(40), 14150–14154.
Zhou, Z. (2013), ‘Heteroscedasticity and autocorrelation robust structural change detection’, Journal of the American Statistical Association 108(502), 726–740.